PATENT

ATTORNEY DOCKET NO: 50110/002005 

METHODS AND REAGENTS FOR MODULATING
CHOLESTEROL LEVELS 
Cross-Reference to Related Applications 
This application claims priority from U.S. Provisional Application No. 
60/124,702, filed March 15, 1999, U.S. Provisional Application No. 60/138,048, 
filed June 8, 1999, U.S. Provisional Application No. 60/139,600, filed June 17, 
1999, and U.S. Provisional Application No. 60/151,977, filed September 1, 1999. 

Background of the Invention 

Low HDL cholesterol (HDL-C), or hypoalphalipoproteinemia, is a blood 
lipid abnormality which correlates with a high risk of cardiovascular disease 
(CVD), in particular coronary artery disease (CAD), but also cerebrovascular 
disease, coronary restenosis, and peripheral vascular disease. HDL, or 'good 
cholesterol' levels are influenced by both environmental and genetic factors. 

Epidemiological studies have consistently demonstrated that plasma HDL-C
concentration is inversely related to the incidence of CAD. HDL-C levels are a
strong, graded, and independent cardiovascular risk factor. Protective effects of an
elevated HDL-C persist until 80 years of age. A low HDL-C is associated with an
increased CAD risk even with normal (<5.2 mmol/L) total plasma cholesterol
levels. Coronary disease risk is increased by 2% in men and 3% in women for
every 1 mg/dL (0.026 mmol/L) reduction in HDL-C, and in the majority of studies
this relationship is statistically significant even after adjustment for other lipid and
non-lipid risk factors. Decreased HDL-C levels are the most common lipoprotein
abnormality seen in patients with premature CAD. Four percent of patients with
premature CAD will have an isolated form of decreased HDL-C levels with no
other lipoprotein abnormalities, while 25% have low HDL levels with
accompanying hypertriglyceridemia.
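
By way of illustration only, the arithmetic in the preceding paragraph can be sketched in Python. The conversion factor and the per-mg/dL risk figures are those quoted above; the function names, and the assumption that the risk increments add linearly with the size of the reduction, are hypothetical and are not taken from the studies referred to.

```python
# Illustrative sketch only; all names are hypothetical.
MMOL_PER_MGDL = 0.026  # 1 mg/dL HDL-C = 0.026 mmol/L, as quoted in the text

# Increase in coronary disease risk per 1 mg/dL reduction in HDL-C (from the text).
RISK_INCREASE_PER_MGDL = {"men": 0.02, "women": 0.03}


def mgdl_to_mmol_per_l(hdl_mgdl: float) -> float:
    """Convert an HDL-C concentration from mg/dL to mmol/L."""
    return hdl_mgdl * MMOL_PER_MGDL


def added_cad_risk(reduction_mgdl: float, sex: str) -> float:
    """Fractional increase in CAD risk for a given reduction in HDL-C.

    Assumption (for illustration): per-mg/dL increments add linearly.
    """
    return RISK_INCREASE_PER_MGDL[sex] * reduction_mgdl
```

Under these assumptions, an HDL-C of 35 mg/dL corresponds to roughly 0.91 mmol/L, close to the 0.9 mmol/L threshold discussed in this section.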

Even in the face of other dyslipidemias or secondary factors, HDL-C levels
are important predictors of CAD. In a cohort of diabetics, those with isolated low
HDL cholesterol had a 65% increased death rate compared to diabetics with
normal HDL cholesterol levels (>0.9 mmol/L). Furthermore, it has been shown
that even within high risk populations, such as those with familial
hypercholesterolemia, HDL cholesterol level is an important predictor of CAD.
Low HDL cholesterol levels thus constitute a major, independent risk factor for CAD.

These findings have led to increased attention to HDL cholesterol levels as
a focus for treatment, following the recommendations of the National Cholesterol
Education Program. These guidelines suggest that HDL cholesterol values below
0.9 mmol/L confer a significant risk for men and women. As such, nearly half of
patients with CAD would have low HDL cholesterol. It is therefore crucial that
we obtain a better understanding of factors which contribute to this phenotype. In
view of the fact that pharmacological intervention for low HDL cholesterol levels
has so far proven unsatisfactory, it is also important to understand the factors that
regulate these levels in the circulation, as this understanding may reveal new
therapeutic targets.

Absolute levels of HDL cholesterol may not always predict risk of CAD. In
the case of CETP deficiency, individuals display an increased risk of developing
CAD, despite increased HDL cholesterol levels. What seems to be important in
this case is the functional activity of the reverse cholesterol transport pathway, the
process by which intracellular cholesterol is trafficked out of the cell to acceptor
proteins such as ApoAI or HDL. Other important genetic determinants of HDL
cholesterol levels, and its inverse relation with CAD, may reside in the processes
leading to HDL formation and intracellular cholesterol trafficking and efflux. To
date, however, this process is poorly understood, and clearly not all of the
components of this pathway have been identified. Thus, defects preventing proper
HDL-mediated cholesterol efflux may be important predictors of CAD. Therefore,
it is critical to identify and understand novel genes involved in the intracellular
cholesterol trafficking and efflux pathways.

HDL particles are central to the process of reverse cholesterol transport and 
thus to the maintenance of tissue cholesterol homeostasis. This process has 
multiple steps which include the binding of HDL to cell surface components, the 
acquisition of cholesterol by passive absorption, the esterification of this 
cholesterol by LCAT and the subsequent transfer of esterified cholesterol by 
CETP, to VLDL and chylomicron remnants for liver uptake. Each of these steps is 
known to impact the plasma concentration of HDL. 

Changes in genes for ApoAI-CIII, lipoprotein lipase, CETP, hepatic lipase, 
and LCAT all contribute to determination of HDL-C levels in humans. One rare 
form of genetic HDL deficiency is Tangier disease (TD), diagnosed in 
approximately 40 patients world-wide, and associated with almost complete 
absence of HDL cholesterol (HDL-C) levels (listed in OMIM as an autosomal 
recessive trait (OMIM 205400)). These patients have very low HDL cholesterol 
and ApoAI levels, which have been ascribed to hypercatabolism of nascent HDL 
and ApoAI, due to a delayed acquisition of lipid and resulting failure of conversion 
to mature HDL. TD patients accumulate cholesterol esters in several tissues, 
resulting in characteristic features, such as enlarged yellow tonsils, 
hepatosplenomegaly, peripheral neuropathy, and cholesterol ester deposition in the 
rectal mucosa. Defective removal of cellular cholesterol and phospholipids by
ApoAI, as well as a marked deficiency in HDL-mediated efflux of intracellular
cholesterol, has been demonstrated in TD fibroblasts. Even though this is a rare
disorder, defining its molecular basis could identify pathways relevant for
cholesterol regulation in the general population. The decreased availability of free
cholesterol for efflux in the surface membranes of cells in Tangier disease patients
appears to be due to a defect in cellular lipid metabolism or trafficking.
Approximately 45% of Tangier patients have signs of premature CAD, suggesting
a strong link between decreased cholesterol efflux, low HDL cholesterol and CAD.
As increased cholesterol is observed in the rectal mucosa of persons with TD, the
molecular mechanism responsible for TD may also regulate cholesterol absorption
from the gastrointestinal (GI) tract.

A more common form of genetic HDL deficiency occurs in patients who
have low plasma HDL cholesterol, usually below the 5th percentile for age and sex
(OMIM 10768), but an absence of clinical manifestations specific to Tangier
disease (Marcil et al., Arterioscler. Thromb. Vasc. Biol. 19:159-169, 1999;
Marcil et al., Arterioscler. Thromb. Vasc. Biol. 15:1015-1024, 1995). These
patients have no obvious environmental factors associated with this lipid
phenotype, do not have severe hypertriglyceridemia or known causes of
severe HDL deficiency (mutations in ApoAI, LCAT, or LPL deficiency), and are
not diabetic. The pattern of inheritance of this condition is most consistent with a
Mendelian dominant trait (OMIM 10768).

The development of drugs that regulate cholesterol metabolism has so far 
progressed slowly. Thus, there is a need for a better understanding of the genetic 
components of the cholesterol efflux pathway. Newly-discovered components can
then serve as targets for drug design.

Low HDL levels are likely to be due to multiple genetic factors. The use of
pharmacogenomics in the aid of designing treatment tailored to the patient makes
it desirable to identify polymorphisms in components of the cholesterol efflux
pathway. An understanding of the effect of these polymorphisms on protein
function would allow for the design of a therapy that is optimal for the patient.

Summary of the Invention 
In a first aspect, the invention features a substantially pure ABC1
polypeptide having ABC1 biological activity. Preferably, the ABC1 polypeptide is
human ABC1 (e.g., one that includes amino acids 1 to 60 or amino acids 61 to
2261 of SEQ ID NO: 1). In one preferred embodiment, the ABC1 polypeptide
includes amino acids 1 to 2261 of SEQ ID NO: 1.

Specifically excluded from the polypeptides of the invention are the
polypeptide having the exact amino acid sequence as GenBank accession number
CAA10005.1 and the nucleic acid having the exact sequence as AJ012376.1. Also
excluded is protein having the exact amino acid sequence as GenBank accession
number X75926.

In a related aspect, the invention features a substantially pure ABC1
polypeptide that includes amino acids 1 to 2261 of SEQ ID NO: 1.

In another aspect, the invention features a substantially pure nucleic acid
molecule encoding an ABC1 polypeptide having ABC1 biological activity (e.g., a
nucleic acid molecule that includes nucleotides 75 to 254 or nucleotides 255 to
6858 of SEQ ID NO: 2). In one preferred embodiment, the nucleic acid molecule
includes nucleotides 75 to 6858 of SEQ ID NO: 2.

In a related aspect, the invention features an expression vector, a cell, or a
non-human mammal that includes the nucleic acid molecule of the invention.

In yet another aspect, the invention features a substantially pure nucleic acid
molecule that includes nucleotides 75 to 254 of SEQ ID NO: 2, nucleotides 255 to
6858 of SEQ ID NO: 2, or nucleotides 75 to 6858 of SEQ ID NO: 2.

In still another aspect, the invention features a substantially pure nucleic
acid molecule that includes at least fifteen nucleotides corresponding to the 5' or
3' untranslated region from a human ABC1 gene. Preferably, the 3' untranslated
region includes nucleotides 7015-7860 of SEQ ID NO: 2.

In a related aspect, the invention features a substantially pure nucleic acid
molecule that hybridizes at high stringency to a probe comprising nucleotides
7015-7860 of SEQ ID NO: 2.

In another aspect, the invention features a method of treating a human
having low HDL cholesterol or a cardiovascular disease, including administering
to the human an ABC1 polypeptide, or cholesterol-regulating fragment thereof, or
a nucleic acid molecule encoding an ABC1 polypeptide, or cholesterol-regulating
fragment thereof. In a preferred embodiment, the human has a low HDL
cholesterol level relative to normal. Preferably, the ABC1 polypeptide is wild-type
ABC1, or has a mutation that increases its stability or its biological activity. A
preferred biological activity is regulation of cholesterol.

In a related aspect, the invention features a method of preventing or treating
cardiovascular disease, including introducing into a human an expression vector
comprising an ABC1 nucleic acid molecule operably linked to a promoter and
encoding an ABC1 polypeptide having ABC1 biological activity.

In another related aspect, the invention features a method of preventing or
ameliorating the effects of a disease-causing mutation in an ABC1 gene, including
introducing into a human an expression vector comprising an ABC1 nucleic acid
molecule operably linked to a promoter and encoding an ABC1 polypeptide
having ABC1 biological activity.

In still another aspect, the invention features a method of treating or
preventing cardiovascular disease, including administering to an animal (e.g., a
human) a compound that mimics the activity of wild-type ABC1 or modulates the
biological activity of ABC1.

One preferred cardiovascular disease that can be treated using the methods 
of the invention is coronary artery disease. Others include cerebrovascular disease 
and peripheral vascular disease. 

The discovery that the ABC1 gene and protein are involved in cholesterol
transport that affects serum HDL levels allows the ABC1 protein and gene to be
used in a variety of diagnostic tests and assays for identification of HDL-increasing
or CVD-inhibiting drugs. In one family of such assays, the ability of
domains of the ABC1 protein to bind ATP is utilized; compounds that enhance this
binding are potential HDL-increasing drugs. Similarly, the anion transport
capabilities and membrane pore-forming functions in cell membranes can be used
for drug screening.

ABC1 expression can also serve as a diagnostic tool for low HDL or CVD;
determination of the genetic subtyping of the ABC1 gene sequence can be used to
subtype low HDL individuals or families to determine whether the low HDL
phenotype is related to ABC1 function. This diagnostic process can lead to the
tailoring of drug treatments according to patient genotype (referred to as
pharmacogenomics), including prediction of the patient's response (e.g., increased
or decreased efficacy or undesired side effects) upon administration of a compound
or drug.

Antibodies to an ABC1 polypeptide can be used both as therapeutics and
diagnostics. Antibodies are produced by immunologically challenging a
B-cell-containing biological system, e.g., an animal such as a mouse, with an ABC1
polypeptide to stimulate production of anti-ABC1 protein by the B-cells, followed
by isolation of the antibody from the biological system. Such antibodies can be
used to measure ABC1 polypeptide in a biological sample such as serum, by
contacting the sample with the antibody and then measuring immune complexes as
a measure of the ABC1 polypeptide in the sample. Antibodies to ABC1 can also
be used as therapeutics for the modulation of ABC1 biological activity.

Thus, in another aspect, the invention features a purified antibody that
specifically binds to ABC1.

In yet another aspect, the invention features a method for determining
whether a candidate compound modulates ABC1 biological activity, comprising:
(a) providing an ABC1 polypeptide; (b) contacting the ABC1 polypeptide with the
candidate compound; and (c) measuring ABC1 biological activity, wherein altered
ABC1 biological activity, relative to an ABC1 polypeptide not contacted with the
compound, indicates that the candidate compound modulates ABC1 biological
activity. Preferably, the ABC1 polypeptide is in a cell or is in a cell-free assay
system.

In still another aspect, the invention features a method for determining
whether a candidate compound modulates ABC1 expression. The method includes
(a) providing a nucleic acid molecule comprising an ABC1 promoter operably
linked to a reporter gene; (b) contacting the nucleic acid molecule with the
candidate compound; and (c) measuring reporter gene expression, wherein altered
reporter gene expression, relative to a nucleic acid molecule not contacted with the
compound, indicates that the candidate compound modulates ABC1 expression.

In another aspect, the invention features a method for determining whether
a candidate compound is useful for modulating cholesterol levels, the method
including the steps of: (a) providing an ABC1 polypeptide; (b) contacting the
polypeptide with the candidate compound; and (c) measuring binding of the ABC1
polypeptide, wherein binding of the ABC1 polypeptide indicates that the candidate
compound is useful for modulating cholesterol levels.

In a related aspect, the invention features a method for determining whether a
candidate compound mimics ABC1 biological activity. The method includes (a)
providing a cell that is not expressing an ABC1 polypeptide; (b) contacting the cell
with the candidate compound; and (c) measuring ABC1 biological activity of the
cell, wherein altered ABC1 biological activity, relative to a cell not contacted with
the compound, indicates that the candidate compound modulates ABC1 biological
activity. Preferably, the cell has an ABC1 null mutation. In one preferred
embodiment, the cell is in a mouse or a chicken (e.g., a WHAM chicken) in which
its ABC1 gene has been mutated.

In still another aspect, the invention features a method for determining
whether a candidate compound is useful for the treatment of low HDL cholesterol.
The method includes (a) providing an ABC transporter (e.g., ABC1); (b)
contacting the transporter with the candidate compound; and (c) measuring ABC
transporter biological activity, wherein increased ABC transporter biological
activity, relative to a transporter not contacted with the compound, indicates that
the candidate compound is useful for the treatment of low HDL cholesterol.
Preferably the ABC transporter is in a cell or a cell free assay system.

In yet another aspect, the invention features a method for determining
whether a candidate compound is useful for modulating cholesterol levels. The
method includes (a) providing a nucleic acid molecule comprising an ABC
transporter promoter operably linked to a reporter gene; (b) contacting the nucleic
acid molecule with the candidate compound; and (c) measuring expression of the
reporter gene, wherein increased expression of the reporter gene, relative to a
nucleic acid molecule not contacted with the compound, indicates that the
candidate compound is useful for modulating cholesterol levels.

In still another aspect, the invention features a method for determining 
whether a candidate compound increases the stability or decreases the regulated 
catabolism of an ABC transporter polypeptide. The method includes (a) providing 
an ABC transporter polypeptide; (b) contacting the transporter with the candidate 
compound; and (c) measuring the half-life of the ABC transporter polypeptide, 
wherein an increase in the half-life, relative to a transporter not contacted with the 
compound, indicates that the candidate compound increases the stability or 
decreases the regulated catabolism of an ABC transporter polypeptide. Preferably 
the ABC transporter is in a cell or a cell free assay system. 

In a preferred embodiment of the screening methods of the present 
invention, the cell is in an animal. The preferred ABC transporters are ABC1,
ABC2, ABCR, and ABC8, and the preferred biological activity is transport of
cholesterol (e.g., HDL cholesterol or LDL cholesterol) or interleukin-1, or is
binding or hydrolysis of ATP by the ABC1 polypeptide.

Preferably, the ABC1 polypeptide used in the screening methods includes
amino acids 1-60 of SEQ ID NO: 1. Alternatively, the ABC1 polypeptide can
include a region encoded by a nucleotide sequence that hybridizes under high
stringency conditions to nucleotides 75 to 254 of SEQ ID NO: 2.

In another aspect, the invention features a method for determining whether
a patient has an increased risk for cardiovascular disease. The method includes
determining whether an ABC1 gene of the patient has a mutation, wherein a
mutation indicates that the patient has an increased risk for cardiovascular disease.

In a related aspect, the invention features a method for determining whether a
patient has an increased risk for cardiovascular disease. The method includes
determining whether an ABC1 gene of the patient has a polymorphism, wherein a
polymorphism indicates that the patient has an increased risk for cardiovascular
disease.

In another aspect, the invention features a method for determining whether
a patient has an increased risk for cardiovascular disease. The method includes
measuring ABC1 biological activity in the patient, wherein increased or decreased
levels in the ABC1 biological activity, relative to normal levels, indicate that the
patient has an increased risk for cardiovascular disease.

In still another aspect, the invention features a method for determining
whether a patient has an increased risk for cardiovascular disease. The method
includes measuring ABC1 expression in the patient, wherein decreased levels in
the ABC1 expression, relative to normal levels, indicate that the patient has an
increased risk for cardiovascular disease. Preferably, the ABC1 expression is
determined by measuring levels of ABC1 polypeptide or ABC1 RNA.

In another aspect, the invention features a non-human mammal having a
transgene comprising a nucleic acid molecule encoding a mutated ABC1
polypeptide. In one embodiment, the mutation is a dominant-negative mutation.

In a related aspect, the invention features a non-human mammal having a
transgene that includes a nucleic acid molecule encoding an ABC1 polypeptide
having ABC1 biological activity.

In another related aspect, the invention features a cell from a non-human
mammal having a transgene that includes a nucleic acid molecule encoding an
ABC1 polypeptide having ABC1 biological activity.

In still another aspect, the invention features a method for determining
whether a candidate compound decreases the inhibition of a dominant-negative
ABC1 polypeptide. The method includes (a) providing a cell expressing a
dominant-negative ABC1 polypeptide; (b) contacting the cell with the candidate
compound; and (c) measuring ABC1 biological activity of the cell, wherein an
increase in the ABC1 biological activity, relative to a cell not contacted with the
compound, indicates that the candidate compound decreases the inhibition of a
dominant-negative ABC1 polypeptide.

By "polypeptide" is meant any chain of more than two amino acids, 
regardless of post-translational modification such as glycosylation or 
phosphorylation. 

By "substantially identical" is meant a polypeptide or nucleic acid 
exhibiting at least 50%, preferably 85%, more preferably 90%, and most 
preferably 95% identity to a reference amino acid or nucleic acid sequence. For 
polypeptides, the length of comparison sequences will generally be at least 16 
amino acids, preferably at least 20 amino acids, more preferably at least 25 amino 
acids, and most preferably 35 amino acids. For nucleic acids, the length of 
comparison sequences will generally be at least 50 nucleotides, preferably at least 
60 nucleotides, more preferably at least 75 nucleotides, and most preferably 
110 nucleotides. 
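
As a non-limiting illustration, the percent-identity thresholds above can be applied to two sequences that have already been aligned to the same length. The following Python sketch is a simple position-by-position count and is not the algorithm of any particular sequence analysis package; the function names are hypothetical, and the default threshold (50%) and minimum comparison length (16 amino acids) are the least stringent polypeptide values given above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions holding the same residue.

    Both inputs are assumed to be pre-aligned (same length, '-' for gaps).
    Gap positions are counted as mismatches in this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def is_substantially_identical(
    aligned_a: str,
    aligned_b: str,
    threshold: float = 50.0,  # "at least 50%" from the definition above
    min_length: int = 16,     # minimum comparison length for polypeptides
) -> bool:
    """Apply an identity threshold over a sufficiently long comparison window."""
    if len(aligned_a) < min_length:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For nucleic acids, the same sketch could be called with min_length=50 and a threshold chosen from the values recited above.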

Sequence identity is typically measured using sequence analysis software 
with the default parameters specified therein (e.g., Sequence Analysis Software 
Package of the Genetics Computer Group, University of Wisconsin Biotechnology 
Center, 1710 University Avenue, Madison, WI 53705). This software program
matches similar sequences by assigning degrees of homology to various 
substitutions, deletions, and other modifications. Conservative substitutions 
typically include substitutions within the following groups: glycine, alanine, 
valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; 
serine, threonine; lysine, arginine; and phenylalanine, tyrosine. 
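
The conservative substitution groups listed above can be expressed, purely for illustration, as a lookup in Python. The one-letter residue codes are the standard amino acid abbreviations; the names CONSERVATIVE_GROUPS and is_conservative are hypothetical, and the sketch does not reproduce how any sequence analysis program scores substitutions.

```python
# Groups as listed in the text, written with standard one-letter codes.
CONSERVATIVE_GROUPS = [
    set("GAVIL"),  # glycine, alanine, valine, isoleucine, leucine
    set("DENQ"),   # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),     # serine, threonine
    set("KR"),     # lysine, arginine
    set("FY"),     # phenylalanine, tyrosine
]


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """True if two different residues fall within the same listed group."""
    a, b = residue_a.upper(), residue_b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

A lysine-to-arginine change falls within a listed group, whereas a cysteine-to-arginine change does not, since cysteine appears in none of the groups.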

By "high stringency conditions" is meant hybridization in 2X SSC at 40 °C
with a DNA probe length of at least 40 nucleotides. For other definitions of high 
stringency conditions, see F. Ausubel et al., Current Protocols in Molecular 
Biology, pp. 6.3.1-6.3.6, John Wiley & Sons, New York, NY, 1994, hereby 
incorporated by reference. 

By "substantially pure polypeptide" is meant a polypeptide that has been 
separated from the components that naturally accompany it. Typically, the 
polypeptide is substantially pure when it is at least 60%, by weight, free from the 
proteins and naturally-occurring organic molecules with which it is naturally 
associated. Preferably, the polypeptide is an ABC1 polypeptide that is at least
75%, more preferably at least 90%, and most preferably at least 99%, by weight,
pure. A substantially pure ABC1 polypeptide may be obtained, for example, by
extraction from a natural source (e.g., a pancreatic cell), by expression of a
recombinant nucleic acid encoding an ABC1 polypeptide, or by chemically
synthesizing the protein. Purity can be measured by any appropriate method, e.g., 
by column chromatography, polyacrylamide gel electrophoresis, or HPLC 
analysis. 

A polypeptide is substantially free of naturally associated components when 
it is separated from those contaminants that accompany it in its natural state. Thus, 
a polypeptide which is chemically synthesized or produced in a cellular system 
different from the cell from which it naturally originates will be substantially free 
from its naturally associated components. Accordingly, substantially pure 
polypeptides include those which naturally occur in eukaryotic organisms but are 
synthesized in E. coli or other prokaryotes. 

By "substantially pure nucleic acid" is meant nucleic acid that is free of the 
genes which, in the naturally-occurring genome of the organism from which the
nucleic acid of the invention is derived, flank the nucleic acid. The term therefore
includes, for example, a recombinant nucleic acid that is incorporated into a
vector; into an autonomously replicating plasmid or virus; into the genomic
nucleic acid of a prokaryote or a eukaryote cell; or that exists as a separate
molecule (e.g., a cDNA or a genomic or cDNA fragment produced by PCR or
restriction endonuclease digestion) independent of other sequences. It also
includes a recombinant nucleic acid that is part of a hybrid gene encoding 
additional polypeptide sequence. 

By "modulates" is meant increase or decrease. Preferably, a compound that
modulates cholesterol levels (e.g., HDL-cholesterol levels, LDL-cholesterol levels,
or total cholesterol levels), or ABC1 biological activity, expression, stability, or
degradation does so by at least 10%, more preferably by at least 25%, and most
preferably by at least 50%.
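
The thresholds in the preceding definition can be illustrated with a short Python sketch that compares a measurement from a treated sample with an untreated control. The function names and tier labels are hypothetical; the 10%, 25%, and 50% cut-offs are those recited above, and the sketch assumes the measured quantity is positive and that a change in either direction counts as modulation.

```python
def percent_change(treated: float, control: float) -> float:
    """Magnitude of the change relative to the control, in percent."""
    if control <= 0:
        raise ValueError("control measurement must be positive")
    return 100.0 * abs(treated - control) / control


def modulation_tier(treated: float, control: float) -> str:
    """Classify a change against the 10% / 25% / 50% thresholds."""
    change = percent_change(treated, control)
    if change >= 50:
        return "most preferred"
    if change >= 25:
        return "more preferred"
    if change >= 10:
        return "preferred"
    return "below threshold"
```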

By "purified antibody" is meant antibody which is at least 60%, by weight,
free from proteins and naturally occurring organic molecules with which it is
naturally associated. Preferably, the preparation is at least 75%, more preferably
90%, and most preferably at least 99%, by weight, antibody. A purified antibody
may be obtained, for example, by affinity chromatography using
recombinantly-produced protein or conserved motif peptides and standard techniques.

By "specifically binds" is meant an antibody that recognizes and binds to,
for example, a human ABC1 polypeptide but does not substantially recognize and
bind to other non-ABC1 molecules in a sample, e.g., a biological sample, that
naturally includes protein. A preferred antibody binds to the ABC1 polypeptide
sequence of Fig. 9A (SEQ ID NO: 1).

By "polymorphism" is meant that a nucleotide or nucleotide region is
characterized as occurring in several different forms. A "mutation" is a form of a
polymorphism in which the expression level, stability, function, or biological
activity of the encoded protein is substantially altered.

By "ABC transporter" or "ABC polypeptide" is meant any transporter that 
hydrolyzes ATP and transports a substance across a membrane. Preferably, an 
ABC transporter polypeptide includes an ATP Binding Cassette and a 
transmembrane region. Examples of ABC transporters include, but are not limited 
to, ABC1, ABC2, ABCR, and ABC8.

By "ABC1 polypeptide" is meant a polypeptide having substantial identity
to an ABC1 polypeptide having the amino acid sequence of SEQ ID NO: 1.

By "ABC biological activity" or "ABC1 biological activity" is meant
hydrolysis or binding of ATP, transport of a compound (e.g., cholesterol, 
interleukin-1) or ion across a membrane, or regulation of cholesterol or 
phospholipid levels (e.g., either by increasing or decreasing HDL-cholesterol or 
LDL-cholesterol levels). 

The invention provides screening procedures for identifying therapeutic 
compounds (cholesterol-modulating or anti-CVD pharmaceuticals) which can be 
used in human patients. Compounds that modulate ABC biological activity (e.g., 
ABC1 biological activity) are considered useful in the invention, as are compounds
that modulate ABC concentration, protein stability, regulated catabolism, or its 
ability to bind other proteins or factors. In general, the screening methods of the 
invention involve screening any number of compounds for therapeutically active 
agents by employing any number of in vitro or in vivo experimental systems. 
Exemplary methods useful for the identification of such compounds are detailed 
below. 

The methods of the invention simplify the evaluation, identification and 
development of active agents for the treatment and prevention of low HDL and
CVD. In general, the screening methods provide a facile means for selecting
natural product extracts or compounds of interest from a large population which
are further evaluated and condensed to a few active and selective materials.
Constituents of this pool are then purified and evaluated in the methods of the
invention to determine their HDL-raising or anti-CVD activities or both. 

Other features and advantages of the invention will be apparent from the 
following description of the preferred embodiments thereof, and from the claims. 

Brief Description of the Drawings 
Figs. 1A and 1B are schematic illustrations showing two pedigrees with
Tangier disease (TD-1 and TD-2). Square and circle symbols represent males
and females, respectively. Diagonal lines are placed through the symbols of all
deceased individuals. A shaded symbol on both alleles indicates the probands with
Tangier disease. Individuals with half shaded symbols have HDL-C levels at or
below the 10th percentile for age and sex, while those with quarter shaded symbols
have HDL-C between the 11th and 20th percentiles.

Each individual's ID number, age at the time of lipid measurement, 
triglyceride level and HDL cholesterol level followed by their percentile ranking 
for age and sex are listed below the pedigree symbol. Markers spanning the 
9q31.1 region are displayed to the left of the pedigree. The affected allele is
represented by the darkened bars which illustrate the mapping of the limits of the
shared haplotype region as seen in Fig. 3. Parentheses connote inferred marker
data, question marks indicate unknown genotypes, and large arrows show the
probands. 

Fig. 1C shows ApoAI (10 μg/mL)-mediated cellular cholesterol efflux in
control fibroblasts (n=5, normalized to 100%) and two subjects with Tangier
disease (TD). Cells were ³H-cholesterol (0.2 μCi/mL) labeled during growth and
cholesterol (20 μg/mL) loaded in growth arrest. Cholesterol efflux is determined
as ³H medium/(³H cell + medium).
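
The efflux calculation described for Fig. 1C can be sketched as follows. This Python example is illustrative only: it assumes that radioactivity is measured as counts in the medium and in the cells, and that each sample is expressed as a percentage of the mean of the control fibroblasts. The function names are hypothetical, and no measured values from the figure are reproduced.

```python
from statistics import mean
from typing import Sequence


def fractional_efflux(medium_counts: float, cell_counts: float) -> float:
    """Efflux as the labeled cholesterol in the medium over the total label."""
    total = medium_counts + cell_counts
    if total <= 0:
        raise ValueError("total counts must be positive")
    return medium_counts / total


def percent_of_control(sample_efflux: float, control_effluxes: Sequence[float]) -> float:
    """Express a sample's efflux relative to the control mean (set to 100%)."""
    return 100.0 * sample_efflux / mean(control_effluxes)
```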

Figs. 2A - 2D are schematic illustrations showing four French Canadian
pedigrees with FHA (FHA-1 to -4). The notations are as in Fig. 1. Exclamation 
points on either side of a genotype (as noted in Families FHA-3 and FHA-4) are 
used when the marker data appears to be inconsistent due to potential 
microsatellite repeat expansions. A bar that becomes a single thin line suggests 
that the haplotype is indeterminate at that marker. 

Figs. 3A - 3E are a schematic illustration showing a genetic and physical
map of 9q31 spanning 35 cM. Fig. 3A: YACs from the region of 9q22-34 were
identified and a YAC contig spanning this region was constructed. Fig. 3B: A
total of 22 polymorphic CA microsatellite markers were mapped to the contig and
used in haplotype analysis in TD-1 and TD-2. Fig. 3C: The mutant haplotypes for
probands in TD-1 and -2 indicate a significant region of homozygosity in TD-2,
while the proband in TD-1 has 2 different mutant haplotypes. The candidate
region can be narrowed to the region of homozygosity for CA markers in proband
2. A critical crossover at D9S1690 in TD-1 (A)* also provides a centromeric
boundary for the region containing the gene. Three candidate genes in this region
(ABC1, LPA-R and RGS-3) are shown. Fig. 3D: Meiotic recombinations in the
FHA families (A-H) refine the minimal critical region to 1.2 cM between D9S277
and D9S1866. The heterozygosity of the TD-2 proband at D9S127, which ends a
continuous region of homozygosity in TD-2, further refines the region to less than
1 cM. This is the region to which ABC1 has been mapped. Fig. 3E: Isolated YAC
DNA and selected markers from the region were used to probe high-density BAC
grid filters, selecting BACs which via STS-content mapping produced an 800 Kb
contig. Four BACs containing ABC1 were sequenced using high-throughput
methods.

Fig. 4A shows the sequence of one mutation in family TD-1. Patient III-01 is 
heterozygous for a T to C transition at nucleotide 4503 of the cDNA; the control is 
homozygous for T at this position. This mutation corresponds to a cysteine to 
arginine substitution in the ABC1 protein (C1477R). 

Fig. 4B shows the amino acid sequence conservation of residue 1477 in 
mouse and human, but not a related C. elegans gene. A change from cysteine to 
arginine likely has an important effect on the protein secondary and tertiary 
structure, as noted by its negative scores in most substitution matrices (Schuler et 
al., A Practical Guide to the Analysis of Genes and Proteins, eds. Baxevanis, A.D. 
& Ouellette, B.F.F. 145:171, 1998). The DNA sequences of the normal and 
mutant genes are shown above and below the amino acid sequences, respectively. 

Fig. 4C shows the segregation of the T4503C mutation in TD-1. The 
presence of the T4503C mutation (+) was assayed by restriction enzyme digestion 
with HgaI, which cuts only the mutant (C) allele (↓). Thus, in the absence of the 
mutation, only the 194 bp PCR product (amplified between → and ←) is observed, 
while in its presence the PCR product is cleaved into fragments of 134 bp and 60 
bp. The proband (individual III-01) was observed to be heterozygous for this 
mutation (as indicated by both the 194 bp and 134 bp bands), as were his daughter, 
father, and three paternal cousins. A fourth cousin and three of the father's 
siblings were not carriers of this mutation. 

Fig. 4D shows that Northern blot analysis with probes spanning the complete 
ABC1 gene reveals the expected ~8 kb transcript and, in addition, a ~3.5 kb 
truncated transcript seen only in the proband TD-1 and not in TD-2 or control. 
This was detected by probes spanning exons 1-49 (a), 1-41 (b), 1-22 (c), and 23-29 
(d), but not with probes spanning exons 30-41 (e) or 42-49 (f). 

Fig. 5A shows the sequence of the mutation in family TD-2. Patient IV-10 
is homozygous for an A to G transition at nucleotide 1864 of the cDNA (SEQ ID 
NO: 2); the control is homozygous for A at this position. This mutation 
corresponds to a glutamine to arginine substitution in the ABC1 protein (Q597R). 

Fig. 5B shows that the glutamine amino acid, which is mutated in the TD-2 
proband, is conserved in human and mouse ABC1 as well as in an ABC 
orthologue from C. elegans, revealing the specific importance of this residue in the 
structure/function of this ABC protein in both worms and mammals. The DNA 
sequences of the normal and mutant proteins are shown above and below the 
amino acid sequences, respectively. 

Fig. 5C shows the segregation of the A1864G mutation in TD-2. The 
presence of the A1864G mutation (indicated by +) was assayed by restriction 
enzyme digestion with AciI. The 360 bp PCR product has one invariant AciI 
recognition site (↓), and a second one is created by the A1864G mutation. The 
wild-type allele is thus cleaved to fragments of 215 bp and 145 bp, while the 
mutant allele (G-allele) is cleaved to fragments of 185 bp, 145 bp and 30 bp. The 
proband (individual IV-10), the product of a consanguineous mating, was 
homozygous for the A1864G mutation (+/+), as evidenced by the presence of only 
the 185 bp and 145 bp bands, while four other family members for whom DNA 
was tested are heterozygous carriers of this mutation (both the 215 bp and 185 bp 
fragments were present). Two unaffected individuals (-/-), with only the 215 bp 
and 145 bp bands are shown for comparison. 
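
The band patterns described for Fig. 5C follow from simple fragment arithmetic, sketched below. The absolute cut positions (215 and 30) are assumptions chosen only so that the fragment sizes match those stated above; the text gives fragment sizes, not site coordinates, and the helper names are hypothetical.

```python
def digest(product_bp: int, cut_positions: list[int]) -> list[int]:
    """Return fragment sizes (largest first) after cutting a linear product."""
    edges = [0] + sorted(cut_positions) + [product_bp]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


PRODUCT_BP = 360
INVARIANT_SITE = 215  # assumed coordinate; gives 215 bp + 145 bp
MUTATION_SITE = 30    # assumed coordinate; splits 215 bp into 185 bp + 30 bp

ALLELE_SITES = {
    "wt": [INVARIANT_SITE],
    "mut": [MUTATION_SITE, INVARIANT_SITE],
}


def expected_bands(genotype: tuple[str, str]) -> list[int]:
    """Union of fragment sizes expected on a gel for a diploid genotype."""
    bands: set[int] = set()
    for allele in genotype:
        bands.update(digest(PRODUCT_BP, ALLELE_SITES[allele]))
    return sorted(bands, reverse=True)


for genotype in [("wt", "wt"), ("wt", "mut"), ("mut", "mut")]:
    print(genotype, expected_bands(genotype))
```

A heterozygote shows the union of both allele patterns, which is why the 215 bp and 185 bp bands together identify carriers.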

Fig. 6A shows a sequence of the mutation in family FHA-1. Patient III-01 
is heterozygous for a deletion of nucleotides 2151-2153 of the cDNA (SEQ ID 
NO: 2). This deletion was detected as a superimposed sequence starting at the first 
nucleotide after the deletion. This corresponds to deletion of leucine 693 in the 
ABC1 protein (SEQ ID NO: 1). 

Fig. 6B is an alignment of the human and mouse wild-type amino acid 
sequences, showing that the human and mouse sequences are identical in the 
vicinity of ΔL693. L693 is also conserved in C. elegans. This highly conserved 
residue lies within a predicted transmembrane domain. The DNA sequences of the 
normal and mutant proteins are shown above and below the amino acid sequences, 
respectively. 

Fig. 6C shows segregation of the ΔL693 mutation in FHA-1, as assayed by 
EarI restriction digestion. Two invariant EarI restriction sites (indicated by ↓) are 
present within the 297 bp PCR product located between the horizontal arrows (→ ←) 
while a third site is present in the wild-type allele only. The presence of the 
mutant allele is thus distinguished by the presence of a 210 bp fragment (+), while 
the normal allele produces a 151 bp fragment (-). The proband of this family 
(III-01) is heterozygous for this mutation, as indicated by the presence of both the 
210 and 151 bp bands. 

Fig. 6D shows a sequence of the mutation in family FHA-3. Patient III-01 
is heterozygous for a deletion of nucleotides 5752-5757 of the cDNA (SEQ ID 
NO: 2). This deletion was detected as a superimposed sequence starting at the first 
nucleotide after the deletion. This corresponds to deletion of glutamic acid 1893 
and aspartic acid 1894 in the ABC1 protein (SEQ ID NO: 1). 

Fig. 6E is an alignment of the human and mouse wild-type amino acid 
sequences, showing that the human and mouse sequences are identical in the 
vicinity of Δ5752-5757. This region is highly conserved in C. elegans. The DNA 
sequences of the normal and mutant proteins are shown above and below the 
amino acid sequences, respectively. 

Fig. 6F shows a sequence of the mutation in family FHA-2. Patient III-01 
is heterozygous for a C to T transition at nucleotide 6504 of the cDNA (SEQ 
ID NO: 2). This alteration converts an arginine at position 2144 of SEQ ID NO: 1 
to a STOP codon, causing truncation of the last 118 amino acids of the ABC1 
protein. 
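
The nucleotide and residue numbers quoted in the descriptions of Figs. 4 to 6 are mutually consistent if the coding sequence begins 74 nucleotides into the cDNA. That offset is an inference from the stated nucleotide/residue pairs (for example 4503 with C1477, 6504 with R2144), not a value given in the text, and the function name below is hypothetical. A minimal sketch of the arithmetic:

```python
# Assumed 5' offset: number of cDNA nucleotides preceding the first codon.
# Inferred from the nucleotide/residue pairs quoted in the figure descriptions.
CDS_OFFSET = 74


def codon_position(cdna_nt: int) -> tuple[int, int]:
    """Map a 1-based cDNA nucleotide to (residue number, position within codon)."""
    index = cdna_nt - CDS_OFFSET - 1  # 0-based index within the coding sequence
    if index < 0:
        raise ValueError("nucleotide lies upstream of the assumed coding start")
    return index // 3 + 1, index % 3 + 1


for nt in (1864, 2151, 4503, 6504):
    print(nt, codon_position(nt))

# A stop codon at residue 2144 that removes the last 118 residues implies
# a full-length protein of 2144 + 118 - 1 residues (again an inference).
implied_length = 2144 + 118 - 1
print(implied_length)
```

Under this assumption, T4503C and C6504T fall on the first base of their codons and A1864G on the second, which agrees with the Cys to Arg, Arg to STOP and Gln to Arg changes described.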

Figs. 7A and 7B show cholesterol efflux from human skin fibroblasts 
treated with ABC1 antisense oligonucleotides. Fibroblasts from a control subject 
were labeled with cholesterol (0.2 µCi/mL) during growth for 48 hours and 
transfected with 500 nM ABC1 antisense AN-1 (5'-GCA GAG GGC ATG GCT 
TTA TTT G-3'; SEQ ID NO: 3) with 7.5 µg lipofectin for 4 hours. Following 
transfection, cells were cholesterol loaded (20 µg/mL) for 12 hours and allowed to 
equilibrate for 6 hours. Cells were then harvested for total RNA, and 10 µg 
was used for Northern blot analysis. Cholesterol efflux experiments were carried 
out as described herein. Fig. 7A: AN-1 was the oligonucleotide that resulted in a 
predictable decrease in ABC1 RNA transcript levels. Fig. 7B: A double antisense 
transfection method was used. In this method, cells were labeled and transfected 
with AN-1 as above, allowed to recover for 20 hours, cholesterol loaded for 24 
hours, and then re-transfected with AN-1. Twenty hours after the second 
transfection, the cholesterol efflux was measured. A ~50% decrease in ABC1 
transcript levels was associated with a significant decrease in cholesterol efflux 
intermediate between that seen in wild-type and TD fibroblasts. 
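
An antisense oligonucleotide pairs with the reverse complement of its own sequence, so the sense-strand target of AN-1 can be recovered as sketched below. The sketch assumes the fifth triplet of AN-1 reads GCT; the helper name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# AN-1 as written 5' to 3' (fifth triplet assumed to be GCT).
AN1 = "GCA GAG GGC ATG GCT TTA TTT G"

target = reverse_complement(AN1)
print(target)
print("ATG" in target)
```

Under that assumption the predicted target contains an ATG triplet, as would be expected for an oligonucleotide directed at a region around a start codon.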

Fig. 7C shows cholesterol efflux from human skin fibroblasts treated 
with antisense oligonucleotides directed to the region encoding the amino-terminal 
60 amino acids. Note that the antisense oligonucleotide AN-6, which is directed to 
the previously unrecognized translation start site, produces a substantial decrease 
in cellular cholesterol efflux. 

Fig. 8 is a schematic illustration showing predicted topology, mutations, 
and polymorphisms of ABC1 in Tangier disease and FHA. The two 
transmembrane and ATP binding domains are indicated. The locations of 
mutations are indicated by the arrows with the amino acid changes, which are 
predicted from the human ABC1 cDNA sequence. These mutations occur in 
different regions of the ABC1 protein. 

Fig. 9A shows the amino acid sequence of the human ABC1 protein (SEQ 
ID NO: 1). 

Figs. 9B - 9E show the nucleotide sequence of the human ABC1 cDNA 
(SEQ ID NO: 2). 

Fig. 10 shows the 5' and 3' nucleotide sequences suitable for use as 5' and 
3' PCR primers, respectively, for the amplification of the indicated ABC1 exon. 

Fig. 11 shows a summary of alterations found in ABC1, including 
sequencing errors, mutations, and polymorphisms. 

Fig. 12 shows a series of genomic contigs (SEQ ID NOS. 14-29) containing 
the ABC1 promoter (SEQ ID NO: 14), as well as exons 1-49 (and flanking intronic 
sequence) of ABC1. The exons (capitalized letters) are found in the contigs as 
follows: SEQ ID NO: 14-exon 1; SEQ ID NO: 15-exon 2; SEQ ID NO: 16-exon 
3; SEQ ID NO: 17-exon 4; SEQ ID NO: 18-exon 5; SEQ ID NO: 19-exon 6; 
SEQ ID NO: 20-exons 7 and 8; SEQ ID NO: 21-exons 9 through 22; SEQ ID 
NO: 22-exons 23 through 28; SEQ ID NO: 23-exon 29; SEQ ID NO: 24-exons 
30 and 31; SEQ ID NO: 25-exon 32; SEQ ID NO: 26-exons 33 through 36; SEQ 
ID NO: 27-exons 37 through 41; SEQ ID NO: 28-exons 42 through 45; SEQ ID 
NO: 29-exons 46 through 49. 

Fig. 13 is a series of illustrations showing that the amino-terminal 60 amino 
acid region of ABC1 is protein-coding. Lysates of normal human fibroblasts were 
immunoblotted in parallel with a rabbit polyclonal antibody to amino acids 1-20 of 
human ABC1 (1); a rabbit polyclonal antibody to amino acids 1430-1449 of 
human ABC1 (2); and a mouse monoclonal antibody to amino acids 2236-2259 of 
human ABC1 (3). The additional bands detected in lane 2 may be due to a lack of 
specificity of that antibody or the presence of degradation products of ABC1. 

Fig. 14 is a schematic illustration showing that the WHAM chicken 
contains a non-conservative substitution (G265A) resulting in an amino acid 
change (E89K). 

Fig. 15 is a schematic illustration showing that the mutation in the WHAM 
chicken is at an amino acid that is conserved among human, mouse, and chicken. 

Fig. 16 shows a summary of locations of consensus transcription factor 
binding sites in the human ABC1 promoter (nucleotides 1-8238 of SEQ ID NO: 
14). The abbreviations are as follows: PPRE=peroxisome proliferator-activated 
receptor. SRE=steroid response element-binding protein site. ROR=RAR-related 
orphan receptor. 

Detailed Description 
Genes play a significant role in influencing HDL levels. Tangier disease (TD) 
was the first reported genetic HDL deficiency. The molecular basis for TD is 
unknown, but the disease has been mapped to 9q31 in three families. We have identified two 
additional probands and their families, and confirmed linkage and refined the locus 
to a limited genomic region. Mutations in the ABC1 gene accounting for all four 
alleles in these two families were detected. A more frequent cause of low HDL 
levels is a distinct disorder, familial HDL deficiency (FHA). On the basis of 
independent linkage, meiotic recombinants and disease associated haplotypes, 
FHA was localized to a small genomic region encompassing the ABC1 gene. A 
mutation in a conserved residue in ABC1 segregated with FHA. Antisense 
reduction of the ABC1 transcript in fibroblasts was associated with a significant 
decrease in cholesterol efflux. 

Cholesterol is normally assembled with intracellular lipids and secreted, but 
in TD the process is diverted and cholesterol is degraded in lysosomes. This 
disturbance in intracellular trafficking of cholesterol results in an increase in 
intracellular cholesterol ester accumulation associated with morphological changes 
of lysosomes and the Golgi apparatus and cholesteryl ester storage in histiocytes, 
Schwann cells, smooth muscle cells, mast cells and fibroblasts. 

The clinical and biochemical heterogeneity in patients with TD has led to 
the possibility that genetic heterogeneity may also underlie this disorder. 
Considering this, we initially performed linkage analysis on these two families of 
different ancestries (TD-1 is Dutch, TD-2 is British; Frohlich et al., Clin. Invest. 
Med. 10:377-382, 1987) and confirmed that the genetic mutations underlying TD 
in these families were localized to the same 9q31 region, to which a large family 
with TD had been assigned (Rust et al., Nature Genetics 20:96-98, 1998). Detailed 
haplotype analysis, together with the construction of a physical map, refined the 
localization of this gene. Mutations in the ABC1 gene were found in TD. 

FHA is much more common than TD, although its precise frequency is not 
known. While TD has been described to date in only 40 families, we have 
identified more than 40 FHA families in the Netherlands and Quebec alone. After 
initial suggestions of linkage to 9q31, thirteen polymorphic markers spanning 
approximately 10 cM in this region were typed and demonstrated the highest LOD 
score at D9S277. Analysis of the homozygosity of markers in the TD-2 proband, 
who was expected to be homozygous for markers close to TD due to his parents' 
consanguinity, placed the TD gene distal to D9S127. Combined genetic data from 
TD and FHA families pointed to the same genomic segment spanning 
approximately 1,000 kb between D9S127 and D9S1866. The ABC1 transporter 
gene was contained within the minimal genomic region. RT-PCR analysis in one 
family demonstrated a deletion of leucine at residue 693 (ΔL693) in the first 
transmembrane domain of ABC1, which segregated with the phenotype of HDL 
deficiency in this family. 

ABC1 is part of the ATP-binding cassette (ABC transporter) superfamily, 
which is involved in energy-dependent transport of a wide variety of substrates 
across membranes (Dean et al., Curr. Opin. Gen. Dev. 5:779-785, 1995). These 
proteins have characteristic motifs conserved throughout evolution which 
distinguish this class of proteins from other ATP binding proteins. In humans 
these genes essentially encode two ATP binding segments and two transmembrane 
domains (Dean et al., Curr. Opin. Gen. Dev. 5:779-785, 1995). We have now 
shown that the ABC1 transporter is crucial for intracellular cholesterol transport. 

We have demonstrated that reduction of the ABC1 transcript using 
oligonucleotide antisense approaches results in decreased efflux, clearly 
demonstrating the link between alterations in this gene and its functional effects. 
TD and FHA now join the growing list of genetic diseases due to defects in the 
ABC group of proteins including cystic fibrosis (Zielenski et al., Annu. Rev. 
Genet. 29:777-807, 1995), adrenoleukodystrophy (Mosser et al., Nature 361:726- 
730, 1993), Zellweger syndrome (Gartner et al., Nat. Genet. 1:23, 1992), 
progressive familial intrahepatic cholestasis (Bull et al., Nat. Genet. 18:219-224, 
1998), and different eye disorders including Stargardt disease (Allikmets et al., 
Nat. Genet. 15:236-246, 1997), autosomal recessive retinitis pigmentosa (Allikmets 
et al., Science 277:1805-1807, 1997), and cone-rod dystrophy (Cremers et al., 
Hum. Mol. Genet. 7:355-362, 1998). 

Patients with TD have been distinguished from patients with FHA on the 
basis that Tangier disease is an autosomal recessive disorder (OMIM 20540) 
while FHA is inherited as an autosomal dominant trait (OMIM 10768). 
Furthermore, patients with TD have obvious evidence for intracellular cholesterol 
accumulation which is not seen in FHA patients. It is now evident that 
heterozygotes for TD do have reduced HDL levels and that the same mechanisms 
underlie the HDL deficiency and cholesterol efflux defects seen in heterozygotes 
for TD as well as FHA. Furthermore, the more severe phenotype in TD represents 
loss of function from both alleles of the ABC1 gene. 

ABC1 is activated by protein kinases, presumably via phosphorylation, 
which also provides one explanation for the essential role of activation of protein 
kinase C in promoting cholesterol efflux (Drobnik et al., Arterioscler. Thromb. 
Vasc. Biol. 15:1369-1377, 1995). Brefeldin, which inhibits trafficking between 
the endoplasmic reticulum and the Golgi, significantly inhibits cholesterol efflux, 
essentially reproducing the effect of mutations in ABC1, presumably through the 
inhibition of ABC1 biological activity. This finding has significance for the 
understanding of mechanisms leading to premature atherosclerosis. TD 
homozygotes develop premature coronary artery disease, as seen in the proband of 
TD-1 (III-01) who had evidence for coronary artery disease at 38 years. This is 
particularly noteworthy as TD patients, in addition to exhibiting significantly 
reduced HDL, also have low LDL cholesterol, and yet they develop atherosclerosis 
despite this. This highlights HDL intracellular transport as an 
important mechanism in atherogenesis. There is significant evidence that 
heterozygotes for TD are also at increased risk for premature vascular disease 
(Schaefer et al., Ann. Int. Med. 93:261-266, 1980; Serfaty-Lacrosniere et al., 
Atherosclerosis 107:85-98, 1994). There is also preliminary evidence for 
premature atherosclerosis in some probands with FHA (Fig. 2B); e.g., the proband 
in FHA-2 (III-01) had a coronary artery bypass graft at 46 years while the proband 
in FHA-3 (Fig. 2C) had evidence for CAD around 50 years of age. The TD-1 
proband had more severe efflux deficiency than the TD-2 proband (Fig. 1C). 
Interestingly, the TD-2 proband had no evidence for CAD by 62 when he died of 
unrelated causes, providing preliminary evidence for a relationship between the 
degree of cholesterol efflux (mediated in part by the nature of the mutation) and 
the likelihood of atherosclerosis. 

The ABC1 gene plays a crucial role in cholesterol transport and, in 
particular, intracellular cholesterol trafficking in monocytes and fibroblasts. It also 
appears to play a significant role in other tissues such as the nervous system, GI 
tract, and the cornea. Completely defective intracellular cholesterol transport 
results in peripheral neuropathy, corneal opacities, and deposition of cholesterol 
esters in the rectal mucosa. 

HDL deficiency is heterogeneous in nature. The delineation of the genetic 
basis of TD and FHA underscores the importance of this particular pathway in 
intracellular cholesterol transport, and its role in the pathogenesis of 
atherosclerosis. Unraveling of the molecular basis for TD and FHA defines a key 
step in a poorly defined pathway of cholesterol efflux from cells and could lead to 
new approaches to treatment of patients with HDL deficiency in the general 
population. 

HDL has been implicated in numerous other biological processes, including 
but not limited to: prevention of lipoprotein oxidation; absorption of endotoxins; 
protection against Trypanosoma brucei infection; modulation of endothelial cells; 
and prevention of platelet aggregation (see Genest et al., J. Invest. Med. 47: 31-42, 
1999, hereby incorporated by reference). Any compound that modulates HDL 
levels may be useful in modulating one or more of the foregoing processes. The 
present discovery that ABC1 functions to regulate HDL levels links, for the first 
time, ABC1 with the foregoing processes. 

The following examples are to illustrate the invention. They are not meant 
to limit the invention in any way. 

Analysis of TD Families 

Studies of cholesterol efflux 

Both probands had evidence of marked deficiency of cholesterol efflux 
similar to that previously demonstrated in TD patients (Fig. 1C). TD-1 is of Dutch 
descent while TD-2 is of British descent. 

Linkage analysis and establishment of a physical map 
Multiple DNA markers were genotyped in the region of 9q31 to which 
linkage to TD had been described (Rust et al., Nat. Genet. 20:96-98, 1998). 
Two-point linkage analysis gave a maximal peak LOD score of 6.49 at D9S1832 
(Table 1) with significant evidence of linkage to all markers in a ~10 cM interval. 
Recombination with the most proximal marker, D9S1690, was seen in II-09 in 
Family TD-1 (A* in Fig. 3D), providing a centromeric boundary for the disease 
gene. Multipoint linkage analysis of these data did not increase the precision of 
the positioning of the disease trait locus. 
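
To illustrate how a two-point LOD score behaves, and why a single recombinant forces the score to minus infinity at a recombination fraction of zero, the sketch below uses the textbook formula for phase-known, fully informative meioses. This is a deliberate simplification: pedigree-based linkage programs evaluate full pedigree likelihoods, and the meiosis counts used here are hypothetical.

```python
import math


def two_point_lod(theta: float, n_meioses: int, n_recombinants: int) -> float:
    """LOD = log10[ L(theta) / L(0.5) ] for phase-known informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = n_meioses - n_recombinants
    if theta == 0.0 and n_recombinants > 0:
        return -math.inf  # a recombinant is impossible under complete linkage
    log_l = non_recombinants * math.log10(1.0 - theta)
    if n_recombinants > 0:
        log_l += n_recombinants * math.log10(theta)
    return log_l - n_meioses * math.log10(0.5)


# Hypothetical example: 10 informative meioses, with and without 1 recombinant.
for theta in (0.0, 0.01, 0.05, 0.10, 0.20, 0.30, 0.40):
    print(theta,
          round(two_point_lod(theta, 10, 0), 2),
          round(two_point_lod(theta, 10, 1), 2))
```

With no recombinants the score is highest at a recombination fraction of zero and declines steadily; with one recombinant it is minus infinity at zero and peaks at a small positive fraction.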

A physical map spanning approximately 10 cM in this region was 
established with the development of a YAC contig (Fig. 3A). In addition, 22 other 
polymorphic multi-allelic markers which spanned this particular region were 
mapped to the contig (Fig. 3B) and a subset of these were used in construction of a 
TABLE 1 

Two Point Linkage Analysis of TD-1 and TD-2 

                        LOD Score at recombination fraction 
Marker Locus    0           0.01    0.05    0.10    0.20    0.30    0.40 
D9S1690         -infinity   4.25    4.52    4.26    3.39    2.30    1.07 
D9S277          6.22        6.11    5.67    5.10    3.80    2.60    1.17 
D9S1866         4.97        4.87    4.49    4.00    2.95    1.85    0.70 
D9S1784         5.50        5.40    5.00    4.47    3.36    2.17    0.92 
D9S1832         6.49        6.37    5.91    5.31    4.05    2.69    1.21 
D9S1677         4.60        4.51    4.18    3.76    2.88    1.93    0.93 

Results of pairwise linkage analysis using MLINK. Values correspond to the LOD score for linkage 
between the disease locus and a marker locus for specified values of the recombination fraction. 
TABLE 2. Microsatellite markers used in this study 

Genetic        Type          Hetero-    Number of      Allele frequency¹: 
Marker                       zygosity   alleles        size, bp (proportion) 

D9S283         CA            0.80       10             [illegible]; 201(0.04); 203(0.04) 
D9S176         CA            0.82       9              129(0.03); 131(0.16); 133(0.??); 135(0.1?); 137(0.25); 139(0.03); 
                                                       14?(0.01); 14?(0.05); 147(0.05) 
[illegible]    [illegible]   0.?7       [illegible]    [illegible]; 229(0.0?); [illegible]; 235(0.1?); 237(0.05); 239(0.05) 
[illegible]    [illegible]   [illegible] 15            16?(0.0?); 171(0.?2); 173(0.15); 175(0.10); 177(0.07); 179(0.04); 
                                                       181(0.17); 183(0.06); [illegible] 
[illegible]    [illegible]   0.7?       [illegible]    [illegible]; 151(0.07); 153(0.??); 155(0.0?); 157(0.43); 159(0.06) 
D9S306         CA            0.87       13             102(0.06); 104(0.01); 110(0.??); 112(0.0?); 114(0.16); 116(0.15); 
                                                       118(0.1?); 120(0.23); 122(0.06); 124(0.06); 126(0.03); 1?4(0.0?); 
                                                       136(0.01) 
D9S1866        CA            0.62       11             248(0.06); 252(0.04); 254(0.01); 256(0.58); 258(0.03); 260(0.06); 
                                                       262(0.0?); 264(0.1?); 266(0.06); 268(0.03); 270(0.01) 
D9S1784        CA            0.86       15             174(0.10); 176(0.02); 178(0.??); 180(0.0?); 182(0.11); 184(0.22); 
                                                       186(0.15); 188(0.06); 190(0.0?); 192(0.07); 194(0.0?); 196(0.07); 
                                                       198(0.01); 200(0.01); 202(0.01) 
AFIygl07xf9    CA            n.a.       n.a.           n.a. 
D9S??70        CA            n.a.       n.a.           n.a. 
[illegible]    CA            n.a.       n.a.           n.a. 
D9S2107        CA            0.63       5              n.a. 
D9S?02         CA            0.54       5              291(0.00); 297(0.05); 299(0.?2); 303(0.62); 305(0.02) 
[illegible]    CA            0.51       3              1(0.42); 2(0.?6); 3(0.02) 
D9S?32         CA            0.88       12             161(0.04); 163(0.02); 167(0.02); 169(0.04); 171(0.10); 173(0.09); 
                                                       175(0.15); 177(0.28); 179(0.19); 181(0.04); 183(0.01); 185(0.01) 
D9S1835        CA            0.48       4              110(0.02); 112(0.23); 116(0.63); 118(0.07) 
[illegible]    CA            0.77       10             166(0.10); 172(0.04); 174(0.02); 182(0.02); 184(0.19); 186(0.40); 
                                                       188(0.15); 190(0.04); 192(0.02); 194(0.02) 
D9S261         CA            0.63       7              90(0.02); 92(0.52); 94(0.02); 98(0.02); 100(0.29); 102(0.04); 
                                                       104(0.08) 
[illegible]    CA            0.62       6              136(0.25); 138(0.53); 140(0.01); 142(0.12); 144(0.00); 146(0.07) 
D9S1677        CA            0.81       10             251(0.27); 257(0.27); 259(0.07); 261(0.09); 263(0.??); 265(0.14); 
                                                       267(0.02); 269(0.02); 271(0.04); 273(0.02) 
D9S279         CA            0.78       6              244(0.09); 246(0.18); 248(0.29); 250(0.29); 252(0.07); 254(0.09) 
D9S275         CA            0.62       4              190(0.31); 196(0.07); 198(0.?2); 200(0.09) 

¹ In a Caucasian population of French Canadian or French descent (J. Weissenbach, personal communication, 1993). 
n.a. = not assessed. 

These polymorphic microsatellite markers were used for DNA typing in the region of 9q31 seen 
in Fig. 3. The majority come from the last version of the Genethon human linkage map. The 
frequency of heterozygosity, the number of alleles as well as the allele frequency of each marker 
are presented. 
haplotype for further analysis (Figs. 1A and 1B; Table 2). The condensed 
haplotype in these families is shown in Figs. 1A and 1B. 
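
The heterozygosity and allele frequency columns of Table 2 can be related by the standard expected-heterozygosity formula, H = 1 - Σp². Whether the tabulated values were computed this way or observed directly is not stated, so the sketch below is an assumption-laden cross-check, using the allele frequencies listed for D9S261.

```python
def expected_heterozygosity(allele_freqs: list[float]) -> float:
    """Expected heterozygosity under random mating: 1 - sum of squared frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)


# Allele size (bp) -> proportion, as listed for D9S261 in Table 2.
D9S261 = {90: 0.02, 92: 0.52, 94: 0.02, 98: 0.02, 100: 0.29, 102: 0.04, 104: 0.08}

h = expected_heterozygosity(list(D9S261.values()))
print(len(D9S261), round(h, 3))
```

The result is close to the tabulated heterozygosity of 0.63 for this marker; the listed proportions sum to 0.99 rather than 1, presumably through rounding, and are used here without renormalization.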

While the family of Dutch descent did not demonstrate any consanguinity, 
the proband in TD-2 was the offspring of a first-cousin consanguineous marriage 
(Fig. 1B). We postulated, therefore, that it was most likely that this proband would 
be homozygous for the mutation while the proband in the Dutch family was likely 
to be a compound heterozygote. The Dutch proband shows completely different 
mutation-bearing haplotypes, supporting this hypothesis (Fig. 3C). 

The TD-2 proband was homozygous for all markers tested (Fig. 1B) distal 
to D9S127 but was heterozygous at D9S127 and DNA markers centromeric to it 
(Fig. 3C). This suggested that the gene for TD was likely located in the genomic 
region telomeric of D9S127 and encompassed by the markers demonstrating 
homozygosity (Fig. 3B). 
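
The homozygosity-mapping logic used for the TD-2 proband can be sketched as a search for the longest run of consecutive homozygous markers. The marker order below (centromere to telomere) follows the relationships described in the text, but the allele labels are entirely hypothetical and the function name is illustrative.

```python
Genotype = tuple[int, int]


def longest_homozygous_run(ordered: list[tuple[str, Genotype]]) -> list[str]:
    """Return the longest stretch of consecutive markers with identical alleles."""
    best: list[str] = []
    current: list[str] = []
    for marker, (allele_a, allele_b) in ordered:
        if allele_a == allele_b:
            current.append(marker)
            if len(current) > len(best):
                best = list(current)
        else:
            current = []  # a heterozygous marker breaks the run
    return best


# Hypothetical genotypes, markers ordered centromere -> telomere.
proband = [
    ("D9S1690", (3, 5)),
    ("D9S277", (2, 4)),
    ("D9S127", (1, 6)),   # heterozygous: marks the boundary of the run
    ("D9S306", (4, 4)),
    ("D9S1866", (2, 2)),
]

print(longest_homozygous_run(proband))
```

In the offspring of a consanguineous mating, the disease gene is expected to lie within such a run, so the first heterozygous marker flanking it provides a boundary for the candidate region.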

Mutation detection 

Based on the defect in intracellular cholesterol transport in patients with 
TD, we reviewed the EST database for genes in this region which might 
play a role in this process. One gene that we reviewed as a 
candidate was the lysophosphatidic acid (LPA) receptor (EDG2) which mapped 
near D9S1801 (Fig. 3C). This receptor binds LPA and stimulates phospholipase-C 
(PLC), and is expressed in fibroblasts. It has previously been shown that the 
coordinate regulation of PLC that is necessary for normal HDL3-mediated 
cholesterol efflux is impaired in TD (Walter et al., J. Clin. Invest. 98:2315-2323, 
1996). Therefore this gene represented an excellent candidate for the TD gene. 
Detailed assessment of this gene, using Northern blot and RT-PCR and sequencing 
analysis, revealed no changes segregating with the mutant phenotype in this 
family, in all likelihood excluding this gene as the cause for TD. Polymorphisms 
were detected, however, in the RT-PCR product, indicating expression of 
transcripts from both alleles. 

The second candidate gene (RGS3) encodes a member of a family 
regulating G protein signaling which could also be involved in influencing 
cholesterol efflux (Mendez et al., Trans. Assoc. Amer. Phys. 104:48-53, 1991). 
This gene mapped 0.7 cM telomeric to the LPA-receptor (Fig. 3C), and is 
expressed in fibroblasts. It was assessed by exon-specific amplification, as its 
genomic organization was published (Chatterjee et al., Genomics 45:429-433, 
1997). No significant sequence changes were detected. 

The ABC1 transporter gene had previously been mapped to 9q31, but its 
precise physical location had not been determined (Luciani et al., Genomics 
21:150-159, 1994). The ABC1 gene is a member of the ATP binding cassette 
transporters which represents a superfamily of highly conserved proteins involved 
in membrane transport of diverse substrates including amino acids, peptides, 
vitamins and steroid hormones (Luciani et al., Genomics 21:150-159, 1994; Dean 
et al., Curr. Opin. Gen. Dev. 5:779-785, 1995). Primers to the 3' UTR of this 
gene mapped to YACs spanning D9S306 (887-B2 and 930-D3) compatible with it 
being a strong candidate for TD. We initiated large scale genomic sequencing of 
BACs spanning approximately 800 kb around marker D9S306 (BACs 269, 274, 
279 and 291) (Fig. 3E). The ABC1 gene was revealed encompassing 49 exons and 
a minimum of 75 kb of genomic sequence. In view of the potential function of a 
gene in this family as a cholesterol transporter, its expression in fibroblasts and 
localization to the minimal genomic segment underlying TD, we formally assessed 
ABC1 as a candidate. 

Patient and control total fibroblast RNA was used in Northern blot analysis 
and RT-PCR and sequence analyses. RT-PCR and sequence analysis of TD-1 
revealed a heterozygous T to C substitution (Fig. 4A) in the TD-1 proband, which 
would result in a substitution of arginine for cysteine at a residue conserved 
between mouse and man (Fig. 4B). This mutation, confirmed by sequencing exon 
30 of the ABC1 gene, exhibited complete segregation with the phenotype on one 
side of this family (Fig. 4C). This substitution creates a HgaI site, allowing for 
RFLP analysis of amplified genomic DNA and confirmation of the mutation (Fig. 
4C). The point mutation in exon 30 was not seen on over 200 normal 
chromosomes from unaffected persons of Dutch descent, and 250 chromosomes of 
Western European descent, indicating it is unlikely to be a polymorphism. 
Northern blot analysis of fibroblast RNA from this patient, using a cDNA 
encompassing exons 1 to 49 of the gene, revealed a normal sized ~8 kb transcript 
and a truncated mutant transcript which was not visible in control RNA or in RNA 
from other patients with HDL deficiency (Fig. 4D). Additionally, Northern blot 
analysis using clones encompassing discrete regions of the cDNA revealed that the 
mutant transcript was detected with a cDNA encompassing exons 1 to 49 (a), 1 to 41 
(b), 1 to 22 (c), much more faintly with a probe spanning exons 23 to 29 (d), and was 
not seen with probes encompassing exons 30 to 42 (e) or with a cDNA 
fragment spanning exons 30 to 49 (f). This was repeated on multiple filters with 
control RNA, RNA from other patients with HDL deficiency and the other TD 
proband, and only in TD-1 was the truncated transcript observed. Sequence 
analysis of the coding region did not reveal an alteration in sequence that could 
account for this finding. Furthermore, DNA analysis by Southern blot did not 
reveal any major rearrangements. Completion of exon sequencing in genomic 
DNA showed that this mutation was a G to C transversion at position (+1) of 
intron 24 (Fig. 11), affecting a splice donor site and causing aberrant splicing. 

RT-PCR analysis of fibroblast RNA encoding the ABC1 gene from the 
proband in TD-2 (Fig. 1B) revealed a homozygous nucleotide change of A to G at 
nucleotide 1864 of SEQ ID NO: 2 in exon 13 (Fig. 5A), resulting in a substitution 
of arginine for glutamine at residue 597 of SEQ ID NO: 1 (Fig. 5B), occurring just 
proximal to the first predicted transmembrane domain of ABC1 (Fig. 8) at a 
residue conserved in mouse as well as in a C. elegans homolog. This mutation 
creates a second AciI site within exon 13. Segregation analysis of the mutation in 
this family revealed complete concordance between the mutation and the low HDL 
phenotype as predicted (Fig. 5C). The proband in TD-2 is homozygous for this 
mutation, consistent with our expectation of a disease-causing mutation in this 
consanguineous family. 

Analysis of FHA families 

Linkage analysis and refinement of the minimal genomic region containing 
the gene for FHA 

Data from microsatellite typing of individual family members from the four 
pedigrees of French Canadian origin were analyzed (Fig. 2). A maximum LOD 
score of 9.67 at a recombination fraction of 0.0 was detected at D9S277 on 
chromosome 9q31 (Fig. 3; Table 3). Thereafter, 22 markers were typed in a region 
spanning 10 cM around this locus in these families (Figs. 2 and 3). The allele 
frequencies for these markers were estimated from a sample of unrelated and 
unaffected subjects of French ancestry (Table 2). 

TD and FHA have thus far been deemed distinct with separate clinical and 
biochemical characteristics. Even though the genes for these disorders mapped to 
the same region, it was uncertain whether FHA and TD were due to mutations in 
the same gene or, alternatively, due to mutations in genes in a similar region. 

TABLE 3 
Two Point Linkage Analysis in FHA 

                          LOD Score at recombination fraction 
Marker Locus     0           0.01     0.05     0.10     0.20     0.30     0.40 

D9S283           -infinity   -2.57    0.51     1.48     1.84     1.48     0.76 
D9S^75           -infinity   1.42     3.0?     3.39     3.05     2.22     1.12 
D9S1690          -infinity   3.11     4.04     4.04     3.33     2.24     0.96 
D9S277           9.67        9.51     8.89     8.06     6.29     4.30     2.10 
D9S306           5.60        5.51     5.13     4.62     3.55     2.36     1.11 
D9S1866          -infinity   7.24     7.35     6.87     5.50     3.82     1.91 
D9S1784          -infinity   9.85     9.76     9.03     7.09     4.78     2.25 
D9S'<72          -infinity   2.53     3.00     2.87     2.26     1.50     0.67 
D9S1832          -infinity   5.20     5.97     5.75     4.89     3.02     1.30 
D9S',601         0.14        0.13     0.11     0.09     0.06     0.03     0.01 
D9S1677          -infinity   7.83     7.90     7.38     5.90     4.08     2.01 
D9S279           -infinity   3.43     3.60     3.66     3.01     2.12     1.05 
D9S2TS           -infinity   2.57     2.98     2.91     2.41     1.69     0.81 

Results of pairwise linkage analysis using MLINK. Values correspond to the LOD score for linkage 
between the disease locus and a marker locus for specified values of the recombination fraction. 
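
By way of illustration only, the following sketch shows why a marker with even a single obligate recombinant yields a LOD score of minus infinity at a recombination fraction of 0, as seen for most markers in Table 3. It assumes a simplified counting model with phase-known, fully informative meioses; MLINK itself evaluates the full pedigree likelihood, so this is not the computation used to generate the table. The function name `two_point_lod` and the example counts are hypothetical.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score at recombination fraction theta for phase-known meioses.

    Simplified counting model (an assumption of this sketch); MLINK instead
    evaluates the likelihood over the whole pedigree.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # One observed recombinant is impossible under complete linkage.
        if recombinants > 0:
            return -math.inf
        return meioses * math.log10(2.0)
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical counts, not taken from the pedigrees described herein.
for theta in (0.0, 0.01, 0.05, 0.10, 0.20, 0.30, 0.40):
    print(theta, round(two_point_lod(1, 20, theta), 2))
```

Under this model a marker with no recombinants has its maximum LOD score at a recombination fraction of 0, which matches the pattern reported for D9S277 and D9S306.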

Refinement of the region containing the gene for FHA was possible by examining 
haplotype sharing and identification of critical recombination events (Fig. 2). 
Seven separate meiotic recombination events were seen in these families ("A" 
through "G" in Figs. 2 and 3), clearly indicating that the minimal genomic region 
containing the potential disease gene was a region of approximately 4.4 cM of 
genomic DNA spanned by markers D9S1690 and D9S1866 (Figs. 2 and 3). This 
region is consistent with the results of two point linkage analysis which revealed 
maximal LOD scores with markers D9S277 and D9S306 and essentially excluded 
the region centromeric to D9S1690 or telomeric to D9S1866. An eighth meiotic 
recombination event ("H" in Fig. 3) further refined the FHA region to distal to 
D9S277. 

As described herein, the ABC1 gene mapped within this interval. The 
overlapping genetic data strongly suggested that FHA may in fact be allelic to TD. 
Utilization of sets of genetic data from FHA and TD provided a telomeric 
boundary at D9S1866 (meiotic recombinant) (Fig. 3D) and a centromeric marker 
at D9S127 based on the homozygosity data of TD-2. This refined the locus to 
approximately 1 Mb between D9S127 and D9S1866. The ABC1 gene mapped 
within this minimal region (Fig. 3E). 

Mutation detection in FHA 

Mutation assessment of the ABC1 gene was undertaken in FHA-1 (Fig. 2A). 
Using primers that spanned overlapping segments of the mRNA, we performed 
RT-PCR analysis and subjected these fragments to mutational analysis. A deletion 
of three nucleotides is evident in the RT-PCR sequence of FHA-1 III.01 (Fig. 6A), 
resulting in a loss of nucleotides 2151-2153 of SEQ ID NO: 2 and deletion of a 
leucine (ΔL693) at amino acid position 693 of SEQ ID NO: 1 (Fig. 6A). This 
leucine is conserved in mouse and C. elegans (Fig. 6B). The alteration was 
detected in the RT-PCR products as well as in genomic sequence from exon 14 
specific amplification. This mutation results in a loss of an EarI restriction site. 
Analysis of genomic DNA from the family indicated that the mutation segregated 
completely with the phenotype of HDL deficiency. The loss of the EarI site 
results in a larger fragment remaining in persons heterozygous for this 
mutation (Fig. 6C). This mutation maps to the first putative transmembrane 
domain of ABC1 (Fig. 8) and was not seen in 130 chromosomes from persons of 
French Canadian descent nor in over 400 chromosomes from persons of other 
Western European ancestry. 

A mutation has also been found in patient genomic DNA in pedigree 
FHA-3 from Quebec. The alteration, a 6 bp deletion of nucleotides 5752-5757 of 
SEQ ID NO: 2 within exon 41, results in a deletion of amino acids 1893 (Glu) and 
1894 (Asp) of SEQ ID NO: 1. The deletion was detected as a double, 
superimposed, sequence starting from the point of the deletion (Fig. 6D), and was 
detected in sequence reads in both directions. The deletion can be detected on 3% 
agarose or 10% polyacrylamide gels, and segregates with disease in FHA-3. It 
was not seen in 128 normal chromosomes of French-Canadian origin or in 434 
other control chromosomes. Amino acids 1893 and 1894 are in a region of the 
ABC1 protein that is conserved between human, mouse, and C. elegans (Fig. 6E), 
implying that it is of functional importance. 

An additional mutation has been found in patient genomic DNA in pedigree 
FHA-2 from Quebec (Fig. 6F). The alteration, a C to T transition at position 6504 
of SEQ ID NO: 2, converts an arginine at position 2144 of SEQ ID NO: 1 to a 
STOP codon, causing truncation of the last 118 amino acids of the ABC1 protein. 
This alteration segregates with disease in family FHA-2. 
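
The nucleotide and amino acid positions given above can be cross-checked with simple codon arithmetic. The sketch below is illustrative only. It assumes that the first base of codon 1 lies at nucleotide 75 of SEQ ID NO: 2; this value is not stated in the text but is inferred from the A1864G change, since a glutamine to arginine substitution caused by a single A to G change must fall at the second position of codon 597. The names `CDS_START` and `codon_of` are hypothetical.

```python
# Assumed position of the first base of codon 1 in SEQ ID NO: 2.
# Inferred for this sketch (1864 - 1 - 3 * 596 = 75); not stated in the text.
CDS_START = 75


def codon_of(nucleotide: int, cds_start: int = CDS_START):
    """Return (codon number, position within codon) for a 1-based nucleotide."""
    offset = nucleotide - cds_start
    if offset < 0:
        raise ValueError("nucleotide lies 5' of the assumed start codon")
    return offset // 3 + 1, offset % 3 + 1


def residues_lost_to_stop(stop_codon: int, protein_length: int) -> int:
    """Number of residues removed when a codon is converted to a stop."""
    return protein_length - stop_codon + 1


# A stop at residue 2144 that removes 118 residues implies this length.
PROTEIN_LENGTH = 2144 + 118 - 1

print(codon_of(1864))   # A1864G, Gln597Arg
print(codon_of(2151))   # first base of the three-nucleotide deletion
print(codon_of(6504))   # C to T change creating the stop codon
print(PROTEIN_LENGTH, residues_lost_to_stop(2144, PROTEIN_LENGTH))
```

Under the same assumption, nucleotides 2151-2153 make up the whole of codon 693, and nucleotide 6504 is the first base of codon 2144, in agreement with the positions reported for the FHA-1 and FHA-2 alterations.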

A summary of all mutations and polymorphisms found in ABC1 is shown 
in Fig. 11. Each variant indicated as a mutation segregates with low HDL in its 
family, and was not seen in several hundred control chromosomes. 

Functional relationship between changes in ABC1 transcript levels and 
cholesterol efflux 

Antisense approaches were undertaken to decrease the ABC1 transcript and 
assess the effect of alteration of the transcript on intracellular cholesterol transport. 
The use of antisense primers to the 5' end of ABC1 clearly resulted in a decrease 
to approximately 50% of normal RNA levels (Fig. 7A). This would be expected to 
mimic in part the loss of function due to mutations on one allele, similar to that 
seen in heterozygotes for TD and patients with FHA. Importantly, reduction in the 
mRNA for the ABC1 gene resulted in a significant reduction in cellular cholesterol 
efflux (Fig. 7B), further establishing the role of this protein in reverse cholesterol 
transport and providing evidence that the mutations detected are likely to 
constitute loss of function mutations. Furthermore, these data support the 
functional importance of the first 60 amino acids of the protein. Antisense 
oligonucleotide AN-6 is directed to the novel start codon 5' to the one indicated in 
AJ012376.1; this antisense oligonucleotide effectively suppresses efflux. 

The above-described results were obtained using the following materials 
and methods. 

Patient selection 

The probands in TD families had previously been diagnosed as suffering 
from TD based on clinical and biochemical data. Study subjects with FHA were 
selected from the Cardiology Clinic of the Clinical Research Institute of Montreal. 
The main criterion was an HDL-C level <5th percentile for age and gender, with a 
plasma concentration of triglycerides <95th percentile in the proband and a first- 
degree relative with the same lipid abnormality. In addition, the patients did not 
have diabetes. 

Biochemical studies 

Blood was withdrawn in EDTA-containing tubes for plasma lipid, 
lipoprotein cholesterol, ApoAI, and triglyceride analyses, as well as storage at 
-80°C. Leukocytes were isolated from the buffy coat for DNA extraction. 

Lipoprotein measurement was performed on fresh plasma as described 
elsewhere (Rogler et al., Arterioscler. Thromb. Vasc. Biol. 15:683-690, 1995). 
The laboratory participates in and meets the criteria of the Lipid Research Program 
Standardization Program. Lipids, cholesterol and triglyceride levels were 
determined in total plasma and plasma at density d<1.006 g/mL (obtained after 
preparative ultracentrifugation) before and after precipitation with dextran 
manganese. Apolipoprotein measurement was performed by nephelometry for 
ApoB and ApoAI. 

Linkage analysis 

Linkage between the trait locus and microsatellite loci was analyzed using 
FASTLINK (version 4.0P). FASTLINK/MLINK was used for two-point 
linkage analysis assuming an autosomal dominant trait with complete penetrance. 
In FHA and TD heterozygotes, the phenotype was HDL deficiency <5th percentile 
for age and sex. The disease allele frequency was estimated to be 0.005. Marker 
allele frequencies were estimated from the genotypes of the founders in the 
pedigrees using NEWPREP. Multipoint linkage analysis was carried out using 
FASTLINK/LINKMAP. 



Genomic clone assembly and physical map construction of the 9q31 region 

Using the Whitehead Institute/MIT Center for Genome Research map as a 
reference, the genetic markers of interest at 9q31 were identified within YAC 
contigs. Additional markers that mapped to the approximate 9q31 interval from 
public databases and the literature were then assayed against the YAC clones by 
PCR and hybridization analysis. The order of markers was based on their presence 
or absence in the anchored YAC contigs and later in the BAC contig. Based on 
the haplotype analysis, the region between D9S277 and D9S306 was targeted for 
higher resolution physical mapping studies using bacterial artificial chromosomes 
(BACs). BACs within the region of interest were isolated by hybridization of 
DNA marker probes and whole YACs to high-density filters containing clones 
from the RPCI-11 human BAC library (Fig. 3). 

Sequence retrieval and alignment 

The human ABC1 mRNA sequence was retrieved from GenBank using the 
Entrez nucleotide query (Baxevanis et al., A Practical Guide to the Analysis of 
Genes and Proteins, eds. Baxevanis, A.D. & Ouellette, B.F.F. 98:120, 1998) as 
GenBank accession number AJ012376.1. The version of the protein sequence we 
used as wild-type (normal) was CAA10005.1. 

We identified an additional 60 amino acids in-frame with the previously 
believed start methionine (Fig. 9A). Bioinformatic analysis of the additional 
amino acids indicates the presence of a short stretch of basic amino acid residues, 
followed by a hydrophobic stretch, then several polar residues. This may represent 
a leader sequence, or another transmembrane or membrane-associated region of 
the ABC1 protein. In order to differentiate among the foregoing possibilities, 
antibodies directed to the region of amino acids 1-60 are raised and used to 
determine the physical relationship of amino acids 1-60 to the cell 
membrane. Other standard methods can also be employed, including, for example, 
expression of fusion proteins and cell fractionation. 

We also identified six errors in the previously-reported nucleotide sequence 
(at positions 839, 4738, 5017, 5995, 6557, and 6899 of SEQ ID NO: 2; Fig. 11). 
Hence, the sequence of the ABC1 polypeptide of Fig. 9A differs from 
CAA10005.1 as follows: Thr→Ile at position 1554; Pro→Leu at position 1642; 
Arg→Lys at position 1973; and Pro→Leu at position 2167. We also identified 5' 
and 3' UTR sequence (Figs. 9B - 9E). 

The mouse ABC1 sequence used has accession number X75926. It is very 
likely that this mouse sequence is incomplete, as it lacks the additional 60 amino 
acids described herein for human ABC1. 

Version 1.7 of ClustalW was used for multiple sequence alignments with 
BOXSHADE for graphical enhancement (http://www.isrec.isb-sib.ch:8080/ 
software/BOX_form.html) with the default parameters. A Caenorhabditis elegans 
ABC1 orthologue was identified with BLAST (version 2.08) using CAA10005.1 
(see above) as a query, with the default parameters except for an organism 
filter for C. elegans. The selected protein sequence has accession version number 
AAC69223.1 with a score of 375 and an E value of 10^-103. 

Genomic DNA sequencing 

BAC DNA was extracted from bacterial cultures using NucleoBond 
Plasmid Maxi Kits (Clontech, Palo Alto, CA). For DNA sequencing, a sublibrary 
was first constructed from each of the BAC DNAs (Rowen et al., Automated DNA 
Sequencing and Analysis, eds. Adams, M.D., Fields, C. & Venter, J.C., 1994). In 
brief, the BAC DNA was isolated and randomly sheared by nebulization. The 
sheared DNA was then size fractionated by agarose gel electrophoresis and 
fragments above 2 kb were collected, treated with Mung Bean nuclease followed 
by T4 DNA polymerase and Klenow enzyme to ensure blunt ends, and cloned into 
SmaI-cut M13mp19. Random clones were sequenced with an ABI373 or 377 
sequencer and fluorescently labeled primers (Applied BioSystems, Foster City, 
CA). DNAStar software was used for gel trace analysis and contig assembly. All 
DNA sequences were examined against available public databases primarily using 
BLASTn with RepeatMasker (University of Washington). 

Reverse transcription (RT)-PCR amplification and sequence analysis 

Total RNA was isolated from the cultured fibroblasts of TD and FHA 
patients, and reverse transcribed with a CDS primer containing oligo d(T)18 using 
250 units of Superscript II reverse transcriptase (Life Technologies, Inc., 
Rockville, MD) as described (Zhang et al., J. Biol. Chem. 27:1776-1783, 1996). 
cDNA was amplified with Taq DNA polymerase using primers derived from the 
published human ABC1 cDNA sequence (Luciani et al., Genomics 21:150-159, 
1994). Six sets of primer pairs were designed to amplify each cDNA sample, 
generating six DNA fragments which sequentially overlap, covering nucleotides 
135 to 7014 of the full-length human ABC1 cDNA. The nucleotides are numbered 
according to the order of the published human cDNA sequence (AJ012376.1). 
Primer pairs (1): 135-158 (f) and 1183-1199 (r); (2): 1080-1107 (f) and 2247-2273 
(r); (3): 2171-2197 (f) and 3376-3404 (r); (4): 3323-3353 (f) and 4587-4617 (r); 
(5): 4515-4539 (f) and 5782-5811 (r); (6): 5742-5769 (f) and 6985-7014 (r). RT- 
PCR products were purified by Qiagen spin columns. Sequencing was carried out 
in a Model 373A Automated DNA sequencer (Applied Biosystems) using Taq di- 
deoxy terminator cycle sequencing and Big Dye Kits according to the 
manufacturer's protocol. 
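
The tiling of the six RT-PCR fragments can be verified from the primer coordinates listed above. The following is a minimal sketch, offered for illustration only; it uses the outermost coordinate of each forward and reverse primer as the amplicon boundaries, and the names `PRIMER_PAIRS`, `amplicon_lengths`, and `overlaps` are hypothetical.

```python
# (first base of forward primer, last base of reverse primer) for pairs 1 to 6,
# numbered according to AJ012376.1 as given in the text.
PRIMER_PAIRS = [
    (135, 1199),
    (1080, 2273),
    (2171, 3404),
    (3323, 4617),
    (4515, 5811),
    (5742, 7014),
]


def amplicon_lengths(pairs):
    """Length in bp of each expected RT-PCR product."""
    return [end - start + 1 for start, end in pairs]


def overlaps(pairs):
    """Bases shared by each consecutive pair of products (positive = overlap)."""
    return [prev_end - next_start + 1
            for (_, prev_end), (next_start, _) in zip(pairs, pairs[1:])]


print(amplicon_lengths(PRIMER_PAIRS))
print(overlaps(PRIMER_PAIRS))
print(all(shared > 0 for shared in overlaps(PRIMER_PAIRS)))
```

A positive value for every consecutive pair confirms that the fragments leave no gap between nucleotides 135 and 7014.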



Northern blot analysis 

Northern transfer and hybridizations were performed essentially as 
described (Zhang et al., J. Biol. Chem. 27:1776-1783, 1996). Briefly, 20 µg of 
total fibroblast RNA samples were resolved by electrophoresis in a denaturing 
agarose (1.2%; w/v) gel in the presence of 7% formaldehyde, and transferred to 
nylon membranes. The filters were probed with 32P-labeled human ABC1 cDNA 
as indicated. Pre-hybridization and hybridizations were carried out in an 
ExpressHyb solution (ClonTech) at 68°C according to the manufacturer's 
protocol. 

Detection of the mutations in TD 

Genotyping for the T4503C and A1864G variants was performed by PCR 
amplification of exon 30 followed by restriction digestion with HgaI, and 
amplification of exon 13 followed by digestion with AciI, respectively. PCR was 
carried out in a total volume of 50 µL with 1.5 mM MgCl2, 187.5 nM of each 
dNTP, 2.5 U Taq polymerase and 15 pmol of each primer (forward primer in exon 
30: 5'-GTG GGA GGG AGG GGA GGA AGA GTG-3' (SEQ ID NO: 4); reverse 
primer spanning the junction of exon 30 and intron 30: 5'-GAA AGT GAG TCA 
CTT GTG GAG GA-3' (SEQ ID NO: 5); forward primer in intron 12: 5'-AAA 
GGG GCT TGG TAA GGG TA-3' (SEQ ID NO: 6); reverse in intron 13: 5'-CAT 
GCA CAT GGA CAC ACA TA-3' (SEQ ID NO: 7)). Following an initial 
denaturation of 3 minutes at 95°C, 35 cycles consisting of 95°C 10 seconds, 58°C 
30 seconds, 72°C 30 seconds were performed, with a final extension of 10 minutes 
at 72°C. For detection of the T4503C mutation, 15 µL of exon 30 PCR product 
was incubated with 4 U HgaI in a total volume of 25 µL for 2 hours at 37°C, and 
the resulting fragments were separated on a 1.5% agarose gel. The presence of the 
T4503C mutation creates a restriction site for HgaI, and thus the 194 bp PCR 
product will be cut into fragments of 134 and 60 bp in the presence of the T4503C 
variant, but not in its absence. For detection of the A1864G mutation, 15 µL of 
exon 13 PCR products were digested with 8 U AciI for three hours at 37°C. 
Products were separated on 2% agarose gels. The presence of the A1864G 
mutation creates a second AciI site within the PCR product. Thus, the 360 bp PCR 
product is cleaved into fragments of 215 bp and 145 bp on the wild-type allele, but 
185 bp, 145 bp and 30 bp on the mutant allele. 
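
The fragment sizes quoted above follow directly from the positions of the restriction sites within each PCR product. The sketch below, given for illustration only, computes fragment sizes from cut positions. The cut coordinates are assumptions chosen to reproduce the stated sizes, because the text gives fragment lengths but not the orientation of the sites within the amplicons; the function name `digest` is hypothetical.

```python
def digest(product_length, cut_sites):
    """Fragment sizes (bp, largest first) after cutting a linear PCR product.

    Each cut site is the number of bases to the left of the cut.
    """
    bounds = [0] + sorted(cut_sites) + [product_length]
    return sorted((right - left for left, right in zip(bounds, bounds[1:])),
                  reverse=True)


# Assumed cut coordinates; only the resulting sizes are stated in the text.
EXON13_LENGTH = 360
ACI1_WILD_TYPE = [215]        # single AciI site on the wild-type allele
ACI1_A1864G = [185, 215]      # second AciI site created by A1864G

EXON30_LENGTH = 194
HGA1_WILD_TYPE = []           # no HgaI site without the variant
HGA1_T4503C = [134]           # HgaI site created by T4503C

print(digest(EXON13_LENGTH, ACI1_WILD_TYPE))
print(digest(EXON13_LENGTH, ACI1_A1864G))
print(digest(EXON30_LENGTH, HGA1_WILD_TYPE))
print(digest(EXON30_LENGTH, HGA1_T4503C))
```

In each case the fragment sizes sum to the length of the undigested product, which provides a quick check on the band sizes expected on the gel.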

Detection of mutation in FHA 

Genotyping for the Δ693 variant was performed by PCR amplification of 
exon 14 followed by restriction enzyme digestion with EarI. PCR was carried out 
in a total volume of 80 µL with 1.5 mM MgCl2, 187.5 nM of each dNTP, 2.5 U 
Taq polymerase and 20 pmol of each primer (forward primer in exon 14: 5'-GTT 
TGT GGG GGT GAT GAG GGG GTG AAT-3' (SEQ ID NO: 8); reverse primer 
in intron 14: 5'-GGT TAG GGG GTG TTG AGG TA-3' (SEQ ID NO: 9)). 
Following an initial denaturation of 3 minutes at 95°C, 35 cycles consisting of 
95°C 10 seconds, 55°C 30 seconds, 72°C 30 seconds were performed, with a final 
extension of 10 minutes at 72°C. Twenty microliters of PCR product was 
incubated with 4 U EarI in a total volume of 25 µL for two hours at 37°C, and the 
fragments were separated on a 2% agarose gel. The presence of the Δ693 
mutation destroys a restriction site for EarI, and thus the 297 bp PCR product will 
be cut into fragments of 151 bp, 59 bp, 48 bp and 39 bp in the presence of a wild- 
type allele, but only fragments of 210 bp, 48 bp and 39 bp in the presence of the 
deletion. 
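
Because affected individuals in the FHA families are heterozygous, a digest from such a person contains the bands of both alleles. The following sketch, offered only as an illustration, calls a genotype from the set of band sizes observed after EarI digestion, using the fragment sizes stated above. It assumes that band sizes have already been read from the gel and rounded to the expected values; the function name `call_ear1_genotype` and the returned labels are hypothetical.

```python
WILD_TYPE_BANDS = {151, 59, 48, 39}   # EarI fragments of the 297 bp product
DELETION_BANDS = {210, 48, 39}        # fragments when one EarI site is lost


def call_ear1_genotype(observed_bands):
    """Classify a sample from the band sizes (bp) seen after EarI digestion."""
    observed = set(observed_bands)
    wild_type_only = WILD_TYPE_BANDS - DELETION_BANDS   # 151 and 59 bp
    deletion_only = DELETION_BANDS - WILD_TYPE_BANDS    # 210 bp
    has_wild_type = wild_type_only <= observed
    has_deletion = deletion_only <= observed
    if has_wild_type and has_deletion:
        return "heterozygous"
    if has_wild_type:
        return "homozygous wild-type"
    if has_deletion:
        return "homozygous deletion"
    return "undetermined"


print(call_ear1_genotype([151, 59, 48, 39]))
print(call_ear1_genotype([210, 151, 59, 48, 39]))
```

Only the 210 bp, 151 bp and 59 bp bands are informative, since the 48 bp and 39 bp fragments are produced from both alleles.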

A 6 bp deletion encompassing nucleotides 5752-5757 (inclusive) was 
detected in exon 41 in the proband of family FHA-3 by genomic sequencing using 
primers located within the introns flanking this exon. Genotyping of this mutation 
in family FHA-3 and controls was carried out by PCR with forward (5'-CGT GTA 
AAT GGA AAG GTA TGT GGT GT-3' (SEQ ID NO: 10)) and reverse primers 
(5'-CGT GAA GTG GTT GAT TTG TAA GAT GT-3' (SEQ ID NO: 11)) located 
near the 5' and 3' ends of exon 41, respectively. Each PCR was carried out as for 
the genotyping of the Δ693 variant, but with an annealing temperature of 58°C. 
Twenty microliters of PCR product was resolved on 3% agarose or 10% 
acrylamide gels. The wild type allele was detected as a 117 bp band and the 
mutant allele as a 111 bp band upon staining with ethidium bromide. 

A C to T transition was detected at nucleotide 6504 in genomic DNA of the 
proband of family FHA-2. It was detectable as a double C and T peak in the 
genomic sequence of exon 48 of this individual, who is heterozygous for the 
alteration. This mutation, which creates a STOP codon that results in truncation of 
the last 118 amino acids of the ABC1 protein, also destroys an RsaI restriction site 
that is present in the wild type sequence. Genotyping of this mutation in family 
FHA-2 and controls was carried out by PCR with forward (5'-GGG TTG GGA 
GGG TTG AGT AT-3' (SEQ ID NO: 12)) and reverse (5'-GAT GAG GAA TTG 
AAG GAG GAA-3' (SEQ ID NO: 13)) primers directed to the intronic sequences 
flanking exon 48. PCR was done as for the Δ693 variant. Fifteen microliters of 
PCR product was digested with 5 Units of RsaI at 37°C for two hours and the 
digestion products resolved on 1.5% agarose gels. The mutant allele is detected as 
an uncut 436 bp band. The normal sequence is cut by RsaI to produce 332 and 104 
bp bands. 

Cell culture 

Skin fibroblast cultures were established from 3.0 mm punch biopsies of the 
forearm of FHD patients and healthy control subjects as described (Marcil et al., 
Arterioscler. Thromb. Vasc. Biol. 19:159-169, 1999). 

Cellular cholesterol labeling and loading 

The protocol for cellular cholesterol efflux experiments was described in 
detail elsewhere (Marcil et al., Arterioscler. Thromb. Vasc. Biol. 19:159-169, 
1999). The cells were 3H-cholesterol labeled during growth and free cholesterol 
loaded in growth arrest. 

Cholesterol efflux studies 

Efflux studies were carried out from 0 to 24 hours in the presence of 
purified ApoAI (10 µg protein/mL medium). Efflux was determined as a percent 
of free cholesterol in the medium after the cells were incubated for specified 
periods of time. All experiments were performed in triplicate, in the presence of 
cells from one control subject and the cells from the study subjects to be examined. 
All results showing an efflux defect were confirmed at least three times. 

Oligonucleotide synthesis 

Eight phosphorothioate deoxyoligonucleotides complementary to various 
regions of the human ABC1 cDNA sequence were obtained from GIBCO BRL. 
The oligonucleotides were purified by HPLC. The sequences of the antisense 
oligonucleotides and their location are listed. One skilled in the art will recognize 
that other ABC1 antisense sequences can also be produced and tested for their 
ability to decrease ABC1-mediated cholesterol regulation. 

Name    Sequence (5'-3')                            mRNA target        control 

AN-1    GCAGAGGGCATGGCTTTATTTG (SEQ ID NO: 3)       AUG codon          46 
AN-2    GTGTTCCTGCAGAGGGCATG (SEQ ID NO: 30)        AUG codon          50 
AN-3    CACTTCCAGTAACAGCTGAC (SEQ ID NO: 31)        5'-Untranslated    79 
AN-4    CTTTGCGCATGTCCTTCATGC (SEQ ID NO: 32)       Coding             80 
AN-5    GACATCAGCCCTCAGCATCTT (SEQ ID NO: 33)       Coding             120 
AN-6    CAACAAGCCATGTTCCCTC (SEQ ID NO: 34)         Coding 
AN-7    CATGTTCCCTCAGCCAGC (SEQ ID NO: 35)          Coding 
AN-8    CAGAGCTCACAGCAGGGAC (SEQ ID NO: 36)         Coding 

Cell transfection with antisense oligonucleotides 

Cells were grown in 35 mm culture dishes until 80% confluent, then 
washed once with DMEM medium (serum and antibiotics free). One milliliter of 
DMEM (serum and antibiotics free) containing 500 nM antisense oligonucleotides 
and 5 µg/ml or 7.5 µg/ml of lipofectin (GIBCO BRL) was added to each well 
according to the manufacturer's protocol. The cells were incubated at 37°C for 4 
hours, and then the medium was replaced by DMEM containing 10% FCS. 
Twenty-four hours after the transfection, the total cell RNA was isolated. Ten 
micrograms of total RNA was resolved on a 1% agarose-formaldehyde gel and 
transferred to nylon membrane. The blot was hybridized with α-32P dCTP labeled 
human ABC1 cDNA overnight at 68°C. The membrane was subsequently exposed 
to x-ray film. The hybridizing bands were scanned by optical densitometry and 
standardized to 28S ribosomal RNA. 

Cholesterol efflux with anti-ABC1 oligonucleotides 

Human skin fibroblasts were plated in 6-well plates. The cells were labeled 
with 3H-cholesterol (0.2 µCi/ml) in DMEM with 10% FBS for two days when the 
cells reached 50% confluence. The cells were then transfected with the antisense 
ABC1 oligonucleotides at 500 nM in DMEM (serum and antibiotic free) with 7.5 
µg/ml Lipofectin (GIBCO BRL) according to the manufacturer's protocol. 
Following the transfection, the cells were loaded with nonlipoprotein cholesterol 
(20 µg/ml) for 12 hours in DMEM containing 2 mg/ml BSA without serum. The 
cellular cholesterol pools were then allowed to equilibrate for 6 hours in DMEM- 
BSA. The cholesterol efflux mediated by ApoAI (10 µg/ml, in DMEM-BSA) 
was then carried out, 48 hours after transfection. 

Radiolabeled cholesterol released into the medium is expressed as a 
percentage of total 3H-cholesterol per well (medium + cell). Results are the mean 
+/- SD of triplicate dishes. 
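
The efflux calculation described above can be sketched as follows, for illustration only. The radioactivity counts in the example are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not specify which form of SD was used. The function names `percent_efflux` and `summarize_triplicate` are likewise hypothetical.

```python
import statistics


def percent_efflux(medium_counts: float, cell_counts: float) -> float:
    """Labeled cholesterol in the medium as a percentage of the well total."""
    total = medium_counts + cell_counts
    if total <= 0:
        raise ValueError("total counts per well must be positive")
    return 100.0 * medium_counts / total


def summarize_triplicate(wells):
    """Mean and sample SD of percent efflux over (medium, cell) count pairs."""
    values = [percent_efflux(medium, cell) for medium, cell in wells]
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical counts for three dishes (medium, cell); not measured data.
example_wells = [(100.0, 900.0), (200.0, 800.0), (300.0, 700.0)]
mean_efflux, sd_efflux = summarize_triplicate(example_wells)
print(f"{mean_efflux:.1f} +/- {sd_efflux:.1f} % efflux")
```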

Determination of genomic structure of the ABC1 gene 

Most splice junction sequences were determined from genomic sequence 
generated from BAC clones spanning the ABC1 gene. More than 160 kb of 
genomic sequence were generated. Genomic sequences were aligned with cDNA 
sequences to identify intron/exon boundaries. In some cases, long distance PCR 
between adjacent exons was used to amplify intron/exon boundary sequences 
using amplification primers designed according to the cDNA sequence. 

Functionality of the newly-discovered 60 amino acids at the N-terminus 
Antisense experiments 

Phosphorothioate antisense oligonucleotides were designed to be 
complementary to the regions of the cDNA near the newly discovered translation 
start site. AN-6 and AN-7 both overlap the initiator methionine codon; this site is 
in the middle of oligonucleotide AN-6. AN-8 is complementary to the very 5' end 
of the ABC1 cDNA. Antisense oligonucleotide AN-1 is complementary to the region 
of the ABC1 cDNA corresponding to the site identified as the ABC1 initiator 
methionine in AJ012376. Fig. 7C shows that antisense oligonucleotide AN-6 
interferes with cellular cholesterol efflux in normal fibroblasts to the same extent 
as does antisense oligonucleotide AN-1. Transfection with either of these 
antisense oligonucleotides results in a decrease in cellular cholesterol efflux almost 
as severe as that seen in FHA cells. In general, antisense oligonucleotides 
complementary to coding sequences, especially near the 5' end of a gene's coding 
sequence, are expected to be more effective in decreasing the effective amount of 
transcript than are oligonucleotides directed to more 3' sequences or to non-coding 
sequences. The observation that AN-6 depresses cellular cholesterol efflux as 
effectively as AN-1 implies that both of these oligonucleotides are complementary 
to ABC1 coding sequences, and that the amino terminal 60 amino acids are likely 
to be contained in ABC1 protein. In contrast, the ineffectiveness of AN-8 shows 
that it is likely to be outside the protein coding region of the transcript, as predicted 
by the presence of an in-frame stop codon between the initiator methionine and the 
region targeted by AN-8. 
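
The relationship between each antisense oligonucleotide and an ATG codon in its target can be examined by taking the reverse complement of the oligonucleotide, which gives the sense-strand sequence it binds. The sketch below is illustrative only and uses the oligonucleotide sequences as listed herein. Finding an ATG in a target does not by itself establish that it is an initiator codon; the helper names `reverse_complement` and `atg_offset` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Antisense sequences (5' to 3') as listed in the oligonucleotide table.
ANTISENSE = {
    "AN-1": "GCAGAGGGCATGGCTTTATTTG",
    "AN-6": "CAACAAGCCATGTTCCCTC",
    "AN-7": "CATGTTCCCTCAGCCAGC",
    "AN-8": "CAGAGCTCACAGCAGGGAC",
}


def reverse_complement(sequence: str) -> str:
    """Sense-strand target (5' to 3') of an antisense DNA oligonucleotide."""
    return sequence.translate(COMPLEMENT)[::-1]


def atg_offset(antisense: str) -> int:
    """0-based offset of the first ATG in the target, or -1 if none is present."""
    return reverse_complement(antisense).find("ATG")


for name, oligo in ANTISENSE.items():
    target = reverse_complement(oligo)
    print(name, target, len(target), atg_offset(oligo))
```

For AN-6 the ATG lies near the center of the 19-nucleotide target, and for AN-7 it occupies the last three bases of the target, consistent with the statement that both oligonucleotides overlap the initiator methionine codon. The target of AN-8 contains no ATG.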

Antibody experiments 

Polyclonal and monoclonal antibodies have been generated using peptides 
corresponding to discrete portions of the ABC1 amino acid sequence. One of 
these, 20-amino acid peptide #2 (Pep2: CSVRLSYPPYEQHECHFPNKA (SEQ ID 
NO: 37), in which the N-terminal cysteine was added to facilitate conjugation of 
the peptide) corresponds to a protein sequence within the 60 amino-terminal amino 
acids of the newly-discovered ABC1 protein sequence. The peptide was coupled 
to the KLH carrier protein and 300 µg injected at three intervals into two Balb/c 
mice over a four week period. The spleen was harvested from the mouse with the 
highest ELISA-determined immune response to free peptide, and the cells fused to 
NS-1 myeloma cells by standard monoclonal antibody generation methods. 
Positive hybridomas were selected first by ELISA and then further characterized 
by western blotting using cultured primary human fibroblasts. Monoclonal cell 
lines producing a high antibody titre and specifically recognizing the 245 kD 
human ABC1 protein were saved. The same size ABC1 protein product was 
detected by antibodies directed to four other discrete regions of the same protein. 
The 245 kD band could be eliminated in competition experiments with appropriate 
free peptide, indicating that it represents ABC1 protein (Fig. 13). 

The foregoing experiments indicate that ABC1 protein is detected not only 
by antibodies corresponding to amino acid sequences within the previously- 
described ABC1 amino acid sequence, but also by the Pep2 monoclonal antibody 
that recognizes an epitope within the newly-discovered N-terminal 60 amino acids. 
The N-terminal 60 amino acid region is therefore coding, and is part of the ABC1 
protein. 

The epitope recognized by the Pep2 monoclonal antibody is also conserved 
among human, mouse, and chicken. Liver tissues from these three species 
employed in a Western blot produced an ABC1 band of 245 kD when probed with 
the Pep2 monoclonal antibody. This indicates that the 60 amino acid N-terminal 
sequence is part of the ABC1 coding sequence in humans, mice, and chickens. 
Presence of this region is therefore evolutionarily conserved and likely to be of 
important functional significance for the ABC1 protein. 



Bioinformatic analyses of ABC1 protein sequences 

Transmembrane prediction programs indicate 13 transmembrane (TM) 
regions, the first one being between amino acids 26 and 42 
(http://psort.nibb.ac.jp:8800/psort/helpwww2.html#ealom). The tentative number 
of TM regions for the threshold 0.5 is 13 (INTEGRAL Likelihood = -7.75, 
Transmembrane 26-42). The other 12 TM regions range in value between -0.64 
and -12.79 (full results below). It is therefore very likely that the newly-discovered 
60 amino acids contain a TM domain, and that the amino end of ABC1 may be on 
the opposite side of the membrane than originally thought. 



ALOM: TM region allocation 
Init position for calculation: 1 
Tentative number of TMs for the threshold 0.5: 13 
INTEGRAL   Likelihood =  -7.75   Transmembrane   26 -   42 
INTEGRAL   Likelihood =  -3.98   Transmembrane  640 -  656 
INTEGRAL   Likelihood =  -8.70   Transmembrane  690 -  706 
INTEGRAL   Likelihood =  -9.61   Transmembrane  717 -  733 
INTEGRAL   Likelihood =  -1.44   Transmembrane  749 -  765 
INTEGRAL   Likelihood =  -0.64   Transmembrane  771 -  787 
INTEGRAL   Likelihood =  -1.28   Transmembrane 1041 - 1057 
INTEGRAL   Likelihood = -12.79   Transmembrane 1351 - 1367 
INTEGRAL   Likelihood =  -8.60   Transmembrane 1661 - 1677 
INTEGRAL   Likelihood =  -6.79   Transmembrane 1708 - 1724 
INTEGRAL   Likelihood =  -3.40   Transmembrane 1737 - 1753 
INTEGRAL   Likelihood =  -1.49   Transmembrane 1775 - 1791 
INTEGRAL   Likelihood =  -8.39   Transmembrane 1854 - 1870 
PERIPHERAL Likelihood =   0.69 (at 1643) 
ALOM score: -12.79 (number of TMSs: 13) 


There does not appear to be an obvious cleaved peptide, so these first 60 
amino acid residues are not likely to be cleaved, and are therefore not specifically 
a signal/targeting sequence. No other signals (e.g., for targeting to specific 
organelles) are apparent. 

Agonists and Antagonists 

Useful therapeutic compounds include those which modulate the 
expression, activity, or stability of ABC1. To isolate such compounds, ABC1 
expression, biological activity, or regulated catabolism is measured following the 
addition of candidate compounds to a culture medium of ABC1-expressing cells. 
Alternatively, the candidate compounds may be directly administered to animals 
(for example mice, pigs, or chickens) and used to screen for their effects on ABC1 
expression. 

In addition to its role in the regulation of cholesterol, ABC1 also participates 
in other biological processes for which the development of ABC1 modulators 
would be useful. In one example, ABC1 transports interleukin-1β (IL-1β) across 
the cell membrane and out of cells. IL-1β is a precursor of the inflammatory 
response and, as such, inhibitors or antagonists of ABC1 expression or biological 
activity may be useful in the treatment of any inflammatory disorders, including 
but not limited to rheumatoid arthritis, systemic lupus erythematosus (SLE), hypo- 
or hyperthyroidism, inflammatory bowel disease, and diabetes mellitus. In 
another example, ABC1 expressed in macrophages has been shown to be engaged 
in the engulfment and clearance of dead cells. The ability of macrophages to 
ingest these apoptotic bodies is impaired after antibody-mediated blockade of 
ABC1. Accordingly, compounds that modulate ABC1 expression, stability, or 
biological activity would be useful for the treatment of these disorders. 

ABC1 expression is measured, for example, by standard Northern blot 
analysis using an ABC1 nucleic acid sequence (or fragment thereof) as a 
hybridization probe, or by Western blot using an anti-ABC1 antibody and standard 
techniques. The level of ABC1 expression in the presence of the candidate 
molecule is compared to the level measured for the same cells, in the same culture 
medium, or in a parallel set of test animals, but in the absence of the candidate 
molecule. ABC1 activity can also be measured using the cholesterol efflux assay. 
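
The comparison of expression with and without a candidate molecule can be expressed as a fold change. The sketch below is illustrative only. It assumes band intensities from a blot have already been quantified, and it normalizes each ABC1 signal to a loading-control signal, which is a common practice but is not specified in the text. The function names are hypothetical.

```python
def normalized(abc1_signal, loading_control_signal):
    """ABC1 band intensity divided by a loading-control band intensity."""
    if loading_control_signal <= 0:
        raise ValueError("loading-control signal must be positive")
    return abc1_signal / loading_control_signal


def fold_change(treated, untreated):
    """Ratio of normalized ABC1 expression with and without the candidate.

    Each argument is an (ABC1 signal, loading-control signal) pair.
    """
    baseline = normalized(*untreated)
    if baseline <= 0:
        raise ValueError("untreated ABC1 signal must be positive")
    return normalized(*treated) / baseline


# Example with made-up intensities: an 8-fold increase over untreated cells.
print(fold_change((800.0, 100.0), (100.0, 100.0)))
```

A value above 1 would suggest up-regulation by the candidate and a value below 1 would suggest down-regulation; any cutoff for calling a compound active would have to be chosen for the particular assay.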

Transcriptional Regulation of ABC1 Expression 

ABC1 mRNA is increased approximately 8-fold upon cholesterol loading. 
This increase is likely controlled at the transcriptional level. Using the promoter 
sequence described herein, one can identify transcription factors that bind to the 
promoter by performing, for example, gel shift assays, DNase protection assays, 
or in vitro or in vivo reporter gene-based assays. The identified transcription 
factors are themselves drug targets. In the case of ABC1, drug compounds that act 
through modulation of transcription of ABC1 could be used for HDL modulation, 
atherosclerosis prevention, and the treatment of cardiovascular disease. For 
example, using a compound to inhibit a transcription factor that represses ABC1 
would be expected to result in up-regulation of ABC1 and, therefore, HDL levels. 
In another example, a compound that increases the expression or activity of a 
transcription factor that activates ABC1 would also increase ABC1 expression 
and HDL levels. 

Transcription factors known to regulate apolipoprotein genes or other 
cholesterol- or lipid-regulating genes are of particular relevance. Such factors 
include, but are not limited to, the sterol regulatory element binding proteins 
(SREBP-1 and SREBP-2) and the PPAR (peroxisome proliferator-activated 
receptor) transcription factors. Several consensus sites for certain elements are 
present in the sequenced region 5' to the ABC1 gene (Fig. 16) and are likely to 
modulate ABC1 expression. For example, PPARs may alter transcription of ABC1 
by mechanisms including heterodimerization with retinoid X receptors (RXRs) 
and then binding to specific peroxisome proliferator response elements (PPREs). 
Examples of such PPARs include PPARα, γ, and δ. These distinct PPARs have 
been shown to have transcriptional regulatory effects on different genes. PPARα 
is expressed mainly in liver, whereas PPARγ is expressed predominantly in 
adipocytes. Both PPARα and PPARγ are found in coronary and carotid artery 
atherosclerotic plaques and in endothelial cells, smooth muscle cells, monocytes, 
and monocyte-derived macrophages. Activation of PPARα results in altered 
lipoprotein metabolism through PPARα's effect on genes such as lipoprotein 
lipase (LPL), apolipoprotein CIII (apo CIII), and apolipoprotein AI (apo AI) and 
AII (apo AII). PPARα activation results in overexpression of LPL, apoA-I, and 
apoA-II, but inhibits the expression of apo CIII. PPARα activation also inhibits 
inflammation, stimulates lipid oxidation, and increases the hepatic uptake and 
esterification of free fatty acids (FFAs). PPARα and PPARγ activation may 
inhibit nitric oxide (NO) synthase in macrophages and prevent interleukin-1 (IL-1) 
induced expression of IL-6 and cyclo-oxygenase-2 (COX-2) and thrombin-induced 
endothelin-1 expression secondary to negative transcriptional regulation of the 
NF-κB and activator protein-1 signaling pathways. It has also been shown that 
PPARα induces apoptosis in monocyte-derived macrophages through the 
inhibition of NF-κB activity. 

Activation of PPARα can be achieved by compounds such as fibrates, 
β-estradiol, arachidonic acid derivatives, WY-14,643, and LTB4 or 8(S)-HETE. 
PPARγ activation can be achieved through compounds such as thiazolidinedione 
antidiabetic drugs, 9-HODE, and 13-HODE. Additional compounds such as 
nicotinic acid or HMG CoA reductase inhibitors may also alter the activity of 
PPARs. 

Compounds which alter activity of any of the PPARs (e.g., PPARα or 
PPARγ) may have an effect on ABC1 expression and thereby could affect HDL 
levels, atherosclerosis, and risk of CAD. PPARs are also regulated by fatty acids 
(including modified fatty acids such as 3-thia fatty acids), leukotrienes such as 
leukotriene B4, and prostaglandin J2, which is a natural activator/ligand for 
PPARγ. Drugs that modulate PPARs may therefore have an important effect on 
modulating lipid levels (including HDL and triglyceride levels) and altering CAD 
risk. This effect could be achieved through the modulation of ABC1 gene 
expression. Drugs may also affect ABC1 gene expression, and thereby HDL 
levels, by an indirect effect on PPARs via other transcriptional factors such as 
adipocyte differentiation and determination factor-1 (ADD-1) and sterol regulatory 
element binding protein-1 and 2 (SREBP-1 and 2). Drugs with combined PPARα 
and PPARγ agonist activity, or PPARα and PPARγ agonists given in combination, 
for example, may increase HDL levels even more. 

A PPAR binding site (PPRE element) is found 5' to the ABC1 gene 
(nucleotides 2150 to 2169 of SEQ ID NO: 14). Like the PPRE elements found in 
the C-ACS, HD, CYP4A6 and ApoA-I genes, this PPRE site is a trimer related to 
the PPRE consensus sequence. Partly because of the similarity in the number and 
arrangement of repeats in this PPAR binding site, this element in particular is very 
likely to be of physiological relevance to the regulation of the ABC1 gene. 

Additional transcription factors which may also have an effect in 
modulating ABC1 gene expression and thereby HDL levels, atherosclerosis, and 
CAD risk include: REV-ERBα, SREBP-1 and 2, ADD-1, EBPα, CREB binding 
protein, P300, HNF 4, RAR, LXR, and RORα. Additional degenerate binding 
sites for these factors can be found through examination of the sequence in SEQ 
ID NO: 14. 
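
Examination of a sequence for degenerate binding sites can be automated with a simple mismatch-tolerant scan. The following is a sketch under stated assumptions rather than the method used to annotate SEQ ID NO: 14. The motif shown, two AGGTCA half-sites separated by one nucleotide, is a commonly cited direct-repeat consensus for PPAR/RXR binding and is not taken from this document; the test sequence is a toy string, only the given strand is scanned, and the function name find_sites is hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases each one permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Assumed consensus for illustration only (not from SEQ ID NO: 14).
DR1_CONSENSUS = "AGGTCANAGGTCA"


def find_sites(sequence, motif, max_mismatches=1):
    """Return (1-based position, matched window, mismatches) for each hit."""
    sequence = sequence.upper()
    hits = []
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        # Count positions where the base is not allowed by the motif code.
        mismatches = sum(b not in IUPAC[m] for b, m in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i + 1, window, mismatches))
    return hits


toy_promoter = "TTAGGTCAAAGGTCATT"  # toy sequence, not real promoter data
print(find_sites(toy_promoter, DR1_CONSENSUS, max_mismatches=0))
```

Raising max_mismatches widens the search to more degenerate sites at the cost of more chance matches, so candidate sites found this way would still need experimental confirmation, for example by the gel shift assays described above.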

Additional utility of ABC1 polypeptides, nucleic acids, and modulators 

ABC1 may act as a transporter of toxic proteins or protein fragments (e.g., 
APP) out of cells. Thus, ABC1 agonists/upregulators may be useful in the 
treatment of other disease areas, including Alzheimer's disease, Niemann-Pick 
disease, and Huntington's disease. 

ABC transporters have been shown to increase the uptake of long chain 
fatty acids from the cytosol to peroxisomes and, moreover, to play a role in 
β-oxidation of very long chain fatty acids. Importantly, in X-linked 
adrenoleukodystrophy (ALD), fatty acid metabolism is abnormal, due to defects in 
the peroxisomal ABC transporter. Any agent that upregulates ABC transporter 
expression or biological activity may therefore be useful for the treatment of ALD 
or any other lipid disorder. 

ABC1 is expressed in macrophages and is required for engulfment of cells 
undergoing programmed cell death. The apoptotic process itself, and its 
regulation, have important implications for disorders such as cancer, one 
mechanism of which is failure of cells to undergo cell death appropriately. ABC1 
may facilitate apoptosis, and as such may represent an intervention point for 
cancer treatment. Increasing ABC1 expression or activity or otherwise 
up-regulating ABC1 by any method may constitute a treatment for cancer by 
increasing apoptosis and thus potentially decreasing the aberrant cellular 
proliferation that characterizes this disease. Conversely, down-regulation of ABC1 
by any method may provide opportunity for decreasing apoptosis and allowing 
increased proliferation of cells in conditions where cell growth is limited. Such 
disorders include but are not limited to neurodeficiencies and neurodegeneration, 
and growth disorders. ABC1 could, therefore, potentially be used in a method for 
identification of compounds for use in the treatment of cancer, or in the treatment 
of degenerative disorders. 

Agents that have been shown to inhibit ABC1 include, for example, the 
anti-diabetic agents glibenclamide and glyburide, flufenamic acid, 
diphenylamine-2-carbonic acid, sulfobromophthalein, and DIDS. 

Agents that upregulate ABC1 expression or biological activity include but 
are not limited to protein kinase A, protein kinase C, vanadate, okadaic acid, and 
IBMX. 

Those in the art will recognize that other compounds can also modulate 
ABC1 biological activity, and these compounds are also in the spirit of the 
invention. 

Drug screens based on the ABC1 gene or protein 

The ABC1 protein and gene can be used in screening assays for 
identification of compounds which modulate its activity and may be potential 
drugs to regulate cholesterol levels. Useful ABC1 proteins include wild-type and 
mutant ABC1 proteins or protein fragments, in a recombinant form or 
endogenously expressed. Drug screens to identify compounds acting on the ABC1 
expression product may employ any functional feature of the protein. In one 
example, the phosphorylation state or other post-translational modification is 
monitored as a measure of ABC1 biological activity. ABC1 has ATP binding 
sites, and thus assays may wholly or in part test the ability of ABC1 to bind ATP 
or to exhibit ATPase activity. ABC1, by analogy to similar proteins, is thought to 
be able to form a channel-like structure; drug screening assays could be based 
upon assaying for the ability of the protein to form a channel, or upon the ability to 
transport cholesterol or another molecule, or based upon the ability of other 
proteins bound by or regulated by ABC1 to form a channel. Alternatively, 
phospholipid or lipid transport can also be used as measures of ABC1 biological 
activity. 

There is evidence that, in addition to its role as a regulator of cholesterol 
levels, ABC1 also transports anions. Functional assays could be based upon this 
property, and could employ drug screening technology such as (but not limited to) 
the ability of various dyes to change color in response to changes in specific ion 
concentrations. Such assays can be performed in vesicles such as liposomes, or 
adapted to use whole cells. 

Drug screening assays can also be based upon the ability of ABC1 or other 
ABC transporters to interact with other proteins. Such interacting proteins can be 
identified by a variety of methods known in the art, including, for example, 
radioimmunoprecipitation, co-immunoprecipitation, co-purification, and yeast 
two-hybrid screening. Such interactions can be further assayed by means 
including but not limited to fluorescence polarization or scintillation proximity 
methods. Drug screens can also be based upon functions of the ABC1 protein 
deduced from X-ray crystallography of the protein and comparison of its 3-D 
structure to that of proteins with known functions. Such a crystal structure has 
been determined for the prokaryotic ABC family member HisP, histidine 
permease. Drug screens can be based upon a function or feature apparent upon 
creation of a transgenic or knockout mouse, or upon overexpression of the protein 
or protein fragment in mammalian cells in vitro. Moreover, expression of 
mammalian (e.g., human) ABC1 in yeast or C. elegans allows for screening of 
candidate compounds in wild-type and mutant backgrounds, as well as screens for 
mutations that enhance or suppress an ABC1-dependent phenotype. Modifier 
screens can also be performed in ABC1 transgenic or knock-out mice. 

Additionally, drug screening assays can also be based upon ABC1 functions 
deduced from antisense interference with the gene function. Intracellular 
localization of ABC1, or effects which occur upon a change in intracellular 
localization of the protein, can also be used as an assay for drug screening. 
Immunocytochemical methods will be used to determine the exact location of the 
ABC1 protein. 

Human and rodent ABC1 protein can be used as an antigen to raise 
antibodies, including monoclonal antibodies. Such antibodies will be useful for a 
wide variety of purposes, including but not limited to functional studies and the 
development of drug screening assays and diagnostics. Monitoring the influence 
of agents (e.g., drugs, compounds) on the expression or biological activity of 
ABC1 can be applied not only in basic drug screening, but also in clinical trials. 
For example, the effectiveness of an agent determined by a screening assay as 
described herein to increase ABC1 gene expression, protein levels, or biological 
activity can be monitored in clinical trials of subjects exhibiting altered ABC1 
gene expression, protein levels, or biological activity. Alternatively, the 
effectiveness of an agent determined by a screening assay to modulate ABC1 gene 
expression, protein levels, or biological activity can be monitored in clinical trials 
of subjects exhibiting decreased or otherwise altered gene expression, protein 
levels, or biological activity. In such clinical trials, the expression or activity of 
ABC1 and, preferably, other genes that have been implicated in, for example, 
cardiovascular disease can be used to ascertain the effectiveness of a particular 
drug. 

For example, and not by way of limitation, genes, including ABC1, that are 
modulated in cells by treatment with an agent (e.g., compound, drug or small 
molecule) that modulates ABC1 biological activity (e.g., identified in a screening 
assay as described herein) can be identified. Thus, to study the effect of agents on 
cholesterol levels or cardiovascular disease, for example, in a clinical trial, cells 
can be isolated and RNA prepared and analyzed for the levels of expression of 
ABC1 and other genes implicated in the disorder. The levels of gene expression 
can be quantified by Northern blot analysis or RT-PCR, or, alternatively, by 
measuring the amount of protein produced, by one of a number of methods known 
in the art, or by measuring the levels of biological activity of ABC1 or other genes. 
In this way, the gene expression can serve as a marker, indicative of the 
physiological response of the cells to the agent. Accordingly, this response state 
may be determined before, and at various points during, treatment of the individual 
with the agent. 

In a preferred embodiment, the present invention provides a method for 
monitoring the effectiveness of treatment of a subject with an agent (e.g., an 
agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, 
or other drug candidate identified by the screening assays described herein) 
including the steps of (i) obtaining a pre-administration sample from a subject 
prior to administration of the agent; (ii) detecting the level of expression of an 
ABC1 protein, mRNA, or genomic DNA in the pre-administration sample; (iii) 
obtaining one or more post-administration samples from the subject; (iv) detecting 
the level of expression or activity of the ABC1 protein, mRNA, or genomic DNA 
in the post-administration samples; (v) comparing the level of expression or 
activity of the ABC1 protein, mRNA, or genomic DNA in the pre-administration 
sample with that in the post-administration sample or samples; and (vi) altering 
the administration of the agent to the subject accordingly. For example, increased 
administration of the agent may be desirable to increase the expression or activity 
of ABC1 to higher levels than detected, i.e., to increase the effectiveness of the 
agent. Alternatively, decreased administration of the agent may be desirable to 
decrease expression or activity of ABC1 to lower levels than detected. 
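
Steps (v) and (vi) amount to comparing post-administration levels with the pre-administration baseline and adjusting administration in response. The sketch below illustrates that comparison only. The target range (target_low, target_high) and the function names are hypothetical placeholders, since the text does not specify numeric criteria, and the output is a label for illustration rather than a dosing rule.

```python
def fold_over_baseline(pre_level, post_levels):
    """Mean post-administration ABC1 level divided by the baseline level."""
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    if not post_levels:
        raise ValueError("at least one post-administration sample is needed")
    return (sum(post_levels) / len(post_levels)) / pre_level


def suggest_adjustment(pre_level, post_levels, target_low=1.5, target_high=3.0):
    """Label the comparison of step (v) for use in step (vi).

    target_low and target_high are hypothetical fold-change bounds.
    """
    fold = fold_over_baseline(pre_level, post_levels)
    if fold < target_low:
        return "increase"   # expression or activity lower than desired
    if fold > target_high:
        return "decrease"   # expression or activity higher than desired
    return "maintain"


# Made-up values: baseline 100 units, two post-administration samples.
print(suggest_adjustment(100.0, [110.0, 130.0]))
```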

The ABC1 gene or a fragment thereof can be used as a tool to express the 
protein in an appropriate cell in vitro or in vivo (gene therapy), or can be cloned 
into expression vectors which can be used to produce large enough amounts of 
ABC1 protein to use in in vitro assays for drug screening. Expression systems 
which may be employed include baculovirus, herpes virus, adenovirus, 
adeno-associated virus, bacterial systems, and eukaryotic systems such as CHO 
cells. Naked DNA and DNA-liposome complexes can also be used. 

Assays of ABC1 activity include binding to intracellular interacting 
proteins; interaction with a protein that up-regulates ABC1 activity; interaction 
with HDL particles or constituents; interaction with other proteins which facilitate 
interaction with HDL or its constituents; and measurement of cholesterol efflux. 
Furthermore, assays may be based upon the molecular dynamics of 
macromolecules, metabolites and ions by means of fluorescent-protein biosensors. 

Alternatively, the effect of candidate modulators on expression or activity 
may be measured at the level of ABC1 protein production using the same general 
approach in combination with standard immunological detection techniques, such 
as Western blotting or immunoprecipitation with an ABC1-specific antibody. 
Again, useful cholesterol-regulating or anti-CVD therapeutic modulators are 
identified as those which produce a change in ABC1 polypeptide production. 
Agonists may also affect ABC1 activity without any effect on expression level. 

Candidate modulators may be purified (or substantially purified) molecules 
or may be one component of a mixture of compounds (e.g., an extract or 
supernatant obtained from cells). In a mixed compound assay, ABC1 expression is 
tested against progressively smaller subsets of the candidate compound pool (e.g., 
produced by standard purification techniques, e.g., HPLC or FPLC; Ausubel et al.) 
until a single compound or minimal compound mixture is demonstrated to 
modulate ABC1 expression. 
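
Testing progressively smaller subsets of a pool can be viewed as a halving search. The sketch below is one possible way to organize such a deconvolution and is not a procedure stated in the text. The is_active callback stands in for an actual ABC1 expression assay on a subset, and both it and the function name deconvolve are hypothetical.

```python
def deconvolve(pool, is_active):
    """Narrow an active pool by testing progressively smaller subsets.

    pool is a list of compound identifiers; is_active(subset) reports
    whether the subset modulates ABC1 expression in the assay.
    Returns None if the starting pool is inactive.
    """
    if not is_active(pool):
        return None
    while len(pool) > 1:
        half = len(pool) // 2
        first, second = pool[:half], pool[half:]
        if is_active(first):
            pool = first
        elif is_active(second):
            pool = second
        else:
            # Neither half is active alone, so activity needs components
            # from both halves; stop and report the current mixture.
            return pool
    return pool


compounds = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
print(deconvolve(compounds, lambda subset: "c5" in subset))
```

When activity depends on two or more components that end up in different halves, this simple scheme stops at the smallest mixture it can establish, which may not be the true minimal mixture; a real fractionation would be guided by the purification method used.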

Agonists, antagonists, or mimetics found to be effective at modulating the 
level of cellular ABC1 expression or activity may be confirmed as useful in animal 
models (for example, mice, pigs, rabbits, or chickens). For example, the 
compound may ameliorate the low HDL levels of mouse or chicken 
hypoalphalipoproteinemias. 

A compound that promotes an increase in ABC1 expression or activity is 
considered particularly useful in the invention; such a molecule may be used, for 
example, as a therapeutic to increase the level or activity of native, cellular ABC1 
and thereby treat a low HDL condition in an animal (for example, a human). 

One method for increasing ABC biological activity is to increase the 
stabilization of the ABC protein or to prevent its degradation. Thus, it would be 
useful to identify mutations in an ABC polypeptide (e.g., ABC1) that lead to 
increased protein stability. These mutations can be incorporated into any protein 
therapy or gene therapy undertaken for the treatment of low HDL-C or any other 
condition resulting from loss of ABC1 biological activity. Similarly, compounds 
that increase the stability of a wild-type ABC polypeptide or decrease its 
catabolism may also be useful for the treatment of low HDL-C or any other 
condition resulting from loss of ABC1 biological activity. Such mutations and 
compounds can be identified using the methods described herein. 

In one example, cells expressing an ABC polypeptide having a mutation are 
transiently metabolically labeled during translation and the half-life of the ABC 
polypeptide is determined using standard techniques. Mutations that increase the 
half-life of an ABC polypeptide are ones that increase ABC protein stability. 
These mutations can then be assessed for ABC biological activity. They can also 
be used to identify proteins that affect the stability of ABC1 mRNA or protein. 
One can then assay for compounds that act on these factors or on the ability of 
these factors to bind ABC1. 

In another example, cells expressing wild-type ABC polypeptide are 
transiently metabolically labeled during translation, contacted with a candidate 
compound, and the half-life of the ABC polypeptide is determined using standard 
techniques. Compounds that increase the half-life of an ABC polypeptide are 
useful compounds in the present invention. 
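
The half-life in the two examples above can be estimated from the labeled signal remaining at successive time points. The following is a minimal sketch that assumes simple first-order decay, which is an assumption of the example and not a claim about ABC turnover. It fits a straight line to the natural logarithm of the signal by least squares; the function name half_life is hypothetical.

```python
import math


def half_life(times, signals):
    """Estimate half-life from a least-squares fit of ln(signal) versus time.

    Assumes first-order decay: signal(t) = S0 * exp(-k * t), so the slope of
    ln(signal) against t is -k and the half-life is ln(2) / k.
    """
    if len(times) != len(signals) or len(times) < 2:
        raise ValueError("need at least two matching time points")
    logs = [math.log(s) for s in signals]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / -slope


# Made-up data: labeled polypeptide signal halves every hour.
hours = [0, 1, 2, 3]
untreated = [1000.0, 500.0, 250.0, 125.0]
print(half_life(hours, untreated))
```

Comparing the value obtained with and without a candidate compound, or for mutant versus wild-type polypeptide, gives the change in stability.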

If desired, treatment with an agonist of the invention may be combined with 
any other HDL-raising or anti-CVD therapies. 

It is understood that, while ABC1 is the preferred ABC transporter for the 
drug screens described herein, other ABC transporters can also be used. The 
replacement of ABC1 with another ABC transporter is possible because it is likely 
that ABC transporter family members, such as ABC2, ABCR, or ABC8, will have 
a similar mechanism of regulation. 

Exemplary assays are described in greater detail below. 

Protein-based assays 

ABC1 polypeptide (purified or unpurified) can be used in an assay to 
determine its ability to bind another protein (including, but not limited to, proteins 
found to specifically interact with ABC1). The effect of a compound on that 
binding is then determined. 

Protein Interaction Assays 

ABC1 protein (or a polypeptide fragment thereof or an epitope-tagged form 
or fragment thereof) is harvested from a suitable source (e.g., from a prokaryotic 
expression system, eukaryotic cells, a cell-free system, or by immunoprecipitation 
from ABC1-expressing cells). The ABC1 polypeptide is then bound to a suitable 
support (e.g., nitrocellulose or an antibody or a metal agarose column in the case 
of, for example, a his-tagged form of ABC1). Binding to the support is preferably 
done under conditions that allow proteins associated with ABC1 polypeptide to 
remain associated with it. Such conditions may include use of buffers that 
minimize interference with protein-protein interactions. The binding step can be 
done in the presence and absence of compounds being tested for their ability to 
interfere with interactions between ABC1 and other molecules. If desired, other 
proteins (e.g., a cell lysate) are added, and allowed time to associate with the ABC 
polypeptide. The immobilized ABC1 polypeptide is then washed to remove 
proteins or other cell constituents that may be non-specifically associated with 
the polypeptide or the support. The immobilized ABC1 polypeptide is then 
dissociated from its support, so that proteins bound to it are released (for 
example, by heating), or, alternatively, associated proteins are released from ABC1 
without releasing the ABC1 polypeptide from the support. The released proteins 
and other cell constituents can be analyzed, for example, by SDS-PAGE gel 
electrophoresis, Western blotting and detection with specific antibodies, 
phosphoamino acid analysis, protease digestion, protein sequencing, or isoelectric 
focusing. Normal and mutant forms of ABC1 can be employed in these assays to 
gain additional information about which part of ABC1 a given factor is binding to. 
In addition, when incompletely purified polypeptide is employed, comparison of 
the normal and mutant forms of the protein can be used to help distinguish true 
binding proteins. 

The foregoing assay can be performed using a purified or semipurified 
protein or other molecule that is known to interact with ABC1. This assay may 
include the following steps. 

1. Harvest ABC1 protein and couple a suitable fluorescent label to it; 

2. Label an interacting protein (or other molecule) with a second, different 
fluorescent label. Use dyes that will produce different quenching patterns when 
they are in close proximity to each other vs. when they are physically separate (i.e., 
dyes that quench each other when they are close together but fluoresce when they 
are not in close proximity); 

3. Expose the interacting molecule to the immobilized ABC1 in the 
presence or absence of a compound being tested for its ability to interfere with an 
interaction between the two; and 

4. Collect fluorescent readout data. 

Another assay is a fluorescence resonance energy transfer (FRET) 
assay. This assay can be performed as follows. 

1. Provide ABC1 protein or a suitable polypeptide fragment thereof and 
couple a suitable FRET donor (e.g., nitro-benzoxadiazole (NBD)) to it; 

2. Label an interacting protein (or other molecule) with a FRET acceptor 
(e.g., rhodamine); 

3. Expose the acceptor-labeled interacting molecule to the donor-labeled 
ABC1 in the presence or absence of a compound being tested for its ability to 
interfere with an interaction between the two; and 

4. Measure fluorescence resonance energy transfer. 

Quenching and FRET assays are related. Either one can be applied in a 
given case, depending on which pair of fluorophores is used in the assay. 
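
The fluorescence readout from either assay can be reduced to a single number per well. The sketch below uses the standard donor-quenching estimate of FRET efficiency, E = 1 - F_DA / F_D, where F_DA and F_D are the donor fluorescence with and without acceptor. That formula, the percent-inhibition summary, and the function names are additions for illustration and are not specified in the text.

```python
def fret_efficiency(donor_with_acceptor, donor_alone):
    """FRET efficiency from donor quenching: E = 1 - F_DA / F_D."""
    if donor_alone <= 0:
        raise ValueError("donor-alone fluorescence must be positive")
    return 1.0 - donor_with_acceptor / donor_alone


def percent_inhibition(efficiency_control, efficiency_with_compound):
    """Percent loss of FRET efficiency caused by a test compound."""
    if efficiency_control <= 0:
        raise ValueError("control efficiency must be positive")
    return 100.0 * (1.0 - efficiency_with_compound / efficiency_control)


# Made-up readings in arbitrary fluorescence units.
e_control = fret_efficiency(600.0, 1000.0)    # no test compound
e_compound = fret_efficiency(900.0, 1000.0)   # with test compound
print(e_control, e_compound, percent_inhibition(e_control, e_compound))
```

A compound that disrupts the interaction separates donor from acceptor, so donor fluorescence recovers and the computed efficiency falls.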

Membrane permeability assay 

The ABC1 protein can also be tested for its effects on membrane 
permeability. For example, beyond its putative ability to translocate lipids, ABC1 
might affect the permeability of membranes to ions. Other related membrane 
proteins, most notably the cystic fibrosis transmembrane conductance regulator 
and the sulfonylurea receptor, are associated with and regulate ion channels. 

ABC1 or a fragment of ABC1 is incorporated into a synthetic vesicle, or, 
alternatively, is expressed in a cell and vesicles or other cell sub-structures 
containing ABC1 are isolated. The ABC1-containing vesicles or cells are loaded 
with a reporter molecule (such as a fluorescent ion indicator whose fluorescent 
properties change when it binds a particular ion) that can detect ions (to observe 
outward movement), or alternatively, the external medium is loaded with such a 
molecule (to observe inward movement). A molecule which exhibits differential 
properties when it is inside the vesicle compared to when it is outside the vesicle is 
preferred. For example, a molecule that has quenching properties when it is at 
high concentration but not when it is at low concentration would be 
suitable. The movement of the charged molecule (either its ability to move or the 
kinetics of its movement) in the presence or absence of a compound being tested 
for its ability to affect this process can be determined. 

In another assay, membrane permeability is determined 
electrophysiologically by measuring ionic influx or efflux mediated by or 
modulated by ABC1 using standard electrophysiological techniques. A suitable 
control (e.g., TD cells or a cell line with very low endogenous ABC1 expression) 
can be used in the assay to determine if the effect observed is specific to cells 
expressing ABC1. 

In still another assay, movement of radioactive isotopes into or out of a vesicle 
can be measured. The vesicles are separated from the extravesicular medium and 
the radioactivity in the vesicles and in the medium is quantitated and compared. 
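
The comparison of radioactivity in the vesicles and in the medium can be summarized as the fraction of total counts retained in the vesicles. The sketch below is illustrative only; it assumes background-corrected counts per minute, and the function names are hypothetical.

```python
def vesicle_fraction(vesicle_cpm, medium_cpm):
    """Fraction of total radioactivity found in the vesicles."""
    total = vesicle_cpm + medium_cpm
    if total <= 0:
        raise ValueError("total counts must be positive")
    return vesicle_cpm / total


def relative_uptake(with_compound, without_compound):
    """Vesicle fraction with a test compound relative to that without it.

    Each argument is a (vesicle cpm, medium cpm) pair.
    """
    return vesicle_fraction(*with_compound) / vesicle_fraction(*without_compound)


# Made-up counts: the compound lowers the vesicle fraction from 0.25 to 0.10.
print(relative_uptake((100.0, 900.0), (250.0, 750.0)))
```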

Nucleic acid-based assays 

ABC1 nucleic acid may be used in an assay based on the binding of factors 
necessary for ABC1 gene transcription. The association between the ABC1 DNA 
and the binding factor may be assessed by means of any system that discriminates 
between protein-bound and non-protein-bound DNA (e.g., a gel retardation assay). 
The effect of a compound on the binding of a factor to ABC1 DNA is assessed by 
means of such an assay. In addition to in vitro binding assays, in vivo assays in 
which the regulatory regions of the ABC1 gene are linked to reporter genes can 
also be performed. 

Assays measuring ABC1 stability 

A cell-based or cell-free system can be used to screen for compounds based 
on their effect on the half-life of ABC1 mRNA or ABC1 protein. The assay may 
employ labeled mRNA or protein. Alternatively, ABC1 mRNA may be detected 
by means of specifically hybridizing probes or a quantitative PCR assay. Protein 
can be quantitated, for example, by fluorescent antibody-based methods. 

In vitro mRNA stability assay 

1. Isolate or produce, by in vitro transcription, a suitable quantity of ABC1 
mRNA; 

2. Label the ABC1 mRNA; 

3. Expose aliquots of the mRNA to a cell lysate in the presence or absence 
of a compound being tested for its ability to modulate ABC1 mRNA stability; and 

4. Assess intactness of the remaining mRNA at suitable time points. 

In vitro protein stability assay 

1. Express a suitable amount of ABC1 protein; 

2. Label the protein; 

3. Expose aliquots of the labeled protein to a cell lysate in the presence or 
absence of a compound being tested for its ability to modulate ABC1 protein 
stability; and 

4. Assess intactness of the remaining protein at suitable time points. 

In vivo mRNA or protein stability assay 

1. Incubate cells expressing ABC1 mRNA or protein with a tracer 
(radiolabeled ribonucleotide or radiolabeled amino acid, respectively) for a very 
brief time period (e.g., five minutes) in the presence or absence of a compound 
being tested for its effect on mRNA or protein stability; 

2. Incubate with unlabeled ribonucleotide or amino acid; and 

3. Quantitate the ABC1 mRNA or protein radioactivity at time intervals 
beginning with the start of step 2 and extending to the time when the radioactivity 
in ABC1 mRNA or protein has declined by approximately 80%. It is preferable to 
separate the intact or mostly intact mRNA or protein from its radioactive 
breakdown products by a means such as gel electrophoresis in order to quantitate 
the mRNA or protein. 

Assays measuring inhibition of dominant negative activity 

Mutant ABC1 polypeptides are likely to have dominant negative activity 
(i.e., activity that interferes with wild-type ABC1 function). An assay for a 
compound that can interfere with such a mutant may be based on any method of 
quantitating normal ABC1 activity in the presence of the mutant. For example, 
normal ABC1 facilitates cholesterol efflux, and a dominant negative mutant would 
interfere with this effect. The ability of a compound to counteract the effect of a 
dominant negative mutant may be assessed by measuring cellular cholesterol 
efflux, or any other normal activity of the wild-type ABC1 that is inhibitable by 
the mutant. 

Assays measuring phosphorylation 

The effect of a compound on ABC1 phosphorylation can be assayed by
methods that quantitate phosphates on proteins or that assess the phosphorylation
state of a specific residue of an ABC1. Such methods include but are not limited to
³²P labeling and immunoprecipitation, detection with antiphosphoamino acid
antibodies (e.g., antiphosphoserine antibodies), phosphoamino acid analysis on
2-dimensional TLC plates, and protease digestion fingerprinting of proteins
followed by detection of ³²P-labeled fragments.

Assays measuring other post-translational modifications

The effect of a compound on the post-translational modification of ABC1 can be
assayed by any method capable of quantitating that particular modification. For
example, effects of compounds on glycosylation may be assayed by treating ABC1
with glycosylase and quantitating the amount and nature of carbohydrate released.

The ability of ABC1 to bind ATP provides another assay to screen for
compounds that affect ABC1. ATP binding can be quantitated as follows.

1. Provide ABC1 protein at an appropriate level of purity and reconstitute
it in a lipid vesicle;

2. Expose the vesicle to a labeled but non-hydrolyzable ATP analog (such
as gamma-³⁵S-ATP) in the presence or absence of compounds being tested for their
effect on ATP binding. Note that azido-ATP analogs can be used to allow
covalent attachment of the azido-ATP to protein (by means of UV light), and
permit easier quantitation of the amount of ATP bound to the protein; and

3. Quantitate the amount of ATP analog associated with ABC1.

Assays measuring ATPase activity 

The ATPase activity of ABC1 can also be quantitated to assay the effect of
compounds on ABC1. This is preferably performed in a cell-free assay
so as to separate ABC1 from the many other ATPases in the cell. An ATPase
assay may be performed in the presence or absence of membranes, and with or
without integration of ABC1 protein into a membrane. If performed in a
vesicle-based assay, the ATP hydrolysis products produced or the ATP hydrolyzed
may be measured within or outside of the vesicles, or both. Such an assay may be
based on disappearance of ATP or appearance of ATP hydrolysis products.

For high-throughput screening, a coupled ATPase assay is preferable. For
example, a reaction mixture containing pyruvate kinase and lactate dehydrogenase
can be used. The mixture includes phosphoenolpyruvate (PEP), reduced nicotinamide
adenine dinucleotide (NADH), and ATP. The ATPase activity of ABC1 generates
ADP from ATP. The ADP is then converted back to ATP as part of the pyruvate
kinase reaction. The product, pyruvate, is then converted to lactate by lactate
dehydrogenase. The latter reaction oxidizes NADH, which absorbs light at 340 nm,
to NAD+, which does not, and the entire reaction can be monitored by detection of
the absorbance change as NADH is consumed. Since ADP is limiting for the pyruvate
kinase reaction, this coupled system precisely monitors the ATPase activity of ABC1.
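
A plate-reader trace from such a coupled assay can be converted to an ATP
hydrolysis rate with the Beer-Lambert relation. The sketch below assumes the
readout is NADH absorbance at 340 nm, that one NADH changes oxidation state per
ATP hydrolyzed, and the commonly cited NADH molar extinction coefficient of
6220 per molar per centimeter; the function names, path length, and volume
defaults are hypothetical choices, not values from this specification.

```python
NADH_EXTINCTION = 6220.0  # M^-1 cm^-1 at 340 nm (assumed literature value)


def linear_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two matched points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx


def atp_hydrolysis_rate(times_min, absorbance_340, path_cm=1.0, volume_ml=1.0):
    """Return nmol ATP hydrolyzed per minute from an A340 time course.

    The magnitude of the absorbance slope is used, so the sign convention of
    the instrument does not matter.
    """
    delta_a_per_min = abs(linear_slope(times_min, absorbance_340))
    molar_per_min = delta_a_per_min / (NADH_EXTINCTION * path_cm)
    return molar_per_min * (volume_ml / 1000.0) * 1e9  # mol/min -> nmol/min


def percent_of_control(rate_with_compound, rate_without_compound):
    """Express a test-compound rate relative to the no-compound rate."""
    if rate_without_compound <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * rate_with_compound / rate_without_compound


if __name__ == "__main__":
    minutes = [0, 1, 2, 3]
    trace = [1.0, 0.9378, 0.8756, 0.8134]  # hypothetical A340 readings
    print(atp_hydrolysis_rate(minutes, trace))
```

In practice a rate measured without ABC1 would be subtracted first, so that
background ATP hydrolysis is not attributed to the transporter.
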

Assays measuring cholesterol efflux 

A transport-based assay can be performed in vivo or in vitro. For example, 
the assay may be based on any part of the reverse cholesterol transport process that 
is readily re-created in culture, such as cholesterol or phospholipid efflux. 
Alternatively, the assay may be based on net cholesterol transport in a whole
organism, as assessed by means of a labeled substance (such as cholesterol). 

For high throughput, fluorescent lipids can be used to measure
ABC1-catalyzed lipid efflux. For phospholipids, a fluorescent precursor,
C6-NBD-phosphatidic acid, can be used. This lipid is taken up by cells and
dephosphorylated by phosphatidic acid phosphohydrolase. The product,
NBD-diglyceride, is then a precursor for synthesis of glycerophospholipids like
phosphatidylcholine. The efflux of NBD-phosphatidylcholine can be monitored
by detecting fluorescence resonance energy transfer (FRET) of the NBD to a
suitable acceptor in the cell culture medium. This acceptor can be
rhodamine-labeled phosphatidylethanolamine, a phospholipid that is not readily
taken up by cells. The use of short-chain precursors obviates the requirement for
the phospholipid transfer protein in the media. For cholesterol, NBD-cholesterol
ester can be reconstituted into LDL. The LDL can efficiently deliver this lipid to
cells via the LDL receptor pathway. The NBD-cholesterol esters are hydrolyzed in
the lysosomes, resulting in NBD-cholesterol that can now be transported back to
the plasma membrane and efflux from the cell. The efflux can be monitored by
the aforementioned FRET assay in which NBD transfers its fluorescence
resonance energy to the rhodamine-phosphatidylethanolamine acceptor.

Animal Model Systems 

Compounds identified as having activity in any of the above-described 
assays are subsequently screened in any available animal model system, including, 
but not limited to, pigs, rabbits, and WHAM chickens. Test compounds are 
administered to these animals according to standard methods. Test compounds 
may also be tested in mice bearing mutations in the ABC1 gene. Additionally,
compounds may be screened for their ability to enhance an interaction between
ABC1 and any HDL particle constituent such as ApoAI, ApoAII, or ApoE.

The cholesterol efflux assay as a drug screen 

The cholesterol efflux assay measures the ability of cells to transfer
cholesterol to an extracellular acceptor molecule and is dependent on ABC1
function. In this procedure, cells are loaded with radiolabeled cholesterol by any
of several biochemical pathways (Marcil et al., Arterioscler. Thromb. Vasc. Biol.
19:159-169, 1999). Cholesterol efflux is then measured after incubation for
various times (typically 0 to 24 hours) in the presence of HDL3 or purified ApoAI.
Cholesterol efflux is determined as the percentage of total labeled cholesterol
found in the culture medium after various times of incubation. Increased ABC1
expression levels and/or biological activity are associated with increased efflux,
while decreased levels of ABC1 are associated with decreased cholesterol efflux.
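
The percentage calculation described above can be written out explicitly. This
is a minimal sketch, assuming radioactivity is counted separately in the
culture medium and in the cell fraction of each well; the function names are
hypothetical, and the acceptor-specific correction is a common convention
assumed here rather than a step stated in this specification.

```python
def percent_efflux(medium_counts, cell_counts):
    """Percent of total labeled cholesterol recovered in the medium."""
    if medium_counts < 0 or cell_counts < 0:
        raise ValueError("counts must be non-negative")
    total = medium_counts + cell_counts
    if total == 0:
        raise ValueError("no label recovered in this well")
    return 100.0 * medium_counts / total


def acceptor_specific_efflux(with_acceptor, without_acceptor):
    """Efflux to HDL3 or ApoAI minus efflux to acceptor-free medium.

    Each argument is a (medium_counts, cell_counts) pair. Subtracting the
    acceptor-free value is an assumed convention for removing passive release.
    """
    return percent_efflux(*with_acceptor) - percent_efflux(*without_acceptor)


if __name__ == "__main__":
    # Hypothetical counts for one well incubated with acceptor and one without.
    print(percent_efflux(1500, 8500))
    print(acceptor_specific_efflux((1500, 8500), (300, 9700)))
```
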

This assay can be readily adapted to the format used for drug screening,
which may consist of a multi-well (e.g., 96-well) format. Modification of the
assay to optimize it for drug screening would include scaling down and
streamlining the procedure, modifying the labeling method, using a different
cholesterol acceptor, altering the incubation time, and changing the method of
calculating cholesterol efflux. In all these cases, the cholesterol efflux assay
remains conceptually the same, though experimental modifications may be made.
A transgenic mouse overexpressing ABC1 would be expected to have higher than
normal HDL levels.

Knock-out mouse model 

An animal, such as a mouse, that has had one or both ABC1 alleles
inactivated (e.g., by homologous recombination) is likely to have low HDL-C
levels, and thus is a preferred animal model for screening for compounds that raise
HDL-C levels. Such an animal can be produced using standard techniques. In
addition to the initial screening of test compounds, the animals having mutant
ABC1 genes are useful for further testing of efficacy and safety of drugs or agents
first identified using one of the other screening methods described herein. Cells 
taken from the animal and placed in culture can also be exposed to test 
compounds. HDL-C levels can be measured using standard techniques, such as 
those described herein. 

WHAM chickens: an animal model for low HDL cholesterol 

Wisconsin Hypo-Alpha Mutant (WHAM) chickens arose by spontaneous
mutation in a closed flock. Mutant chickens came to attention through their
Z-linked white shank and white beak phenotype, referred to as 'recessive white
skin' (McGibbon, 1981), and were subsequently found to have a profound
deficiency of HDL (Poernama et al., 1990).

This chicken low HDL locus (Y) is Z-linked, or sex-linked. (In birds,
females are ZW and males are ZZ.) Genetic mapping placed the Y locus on the
long arm of the Z chromosome (Bitgood, 1985), proximal to the ID locus
(Bitgood, 1988). Examination of current public mapping data for the chicken
genome mapping project, ChickMap (maintained by the Roslin Institute;
http://www.ri.bbsrc.ac.uk/chickmap/ChickMapHomePage.html) showed that a
region of synteny with human chromosome 9 lies on the long arm of the chicken Z
chromosome (Zq) proximal to the ID locus. Evidence for this region of synteny is
the location of the chicken aldolase B locus (ALDOB) within this region. The
human ALDOB locus maps to chromosome 9q22.3 (The Genome Database,
http://gdbwww.gdb.org/), not far from the location of human ABC1. This
comparison of maps showed that the chicken Zq region near chicken ALDOB and
the human 9q region near human ALDOB represent a region of synteny between
human and chicken.

Since a low HDL locus maps to the 9q location in humans and to the Zq
region in chickens, these low HDL loci are most probably located within the
syntenic region. Thus we predicted that ABC1 is mutated in WHAM chickens. In
support of this, we have identified an E→K mutation at a position that corresponds
to amino acid 89 of human ABC1 (Figs. 14 and 15). This non-conservative
substitution is at a position that is conserved among human, mouse, and chicken,
indicating that it is in a region of the protein likely to be of functional importance.

Discovery of the WHAM mutation in the amino-terminal portion of the
ABC1 protein also establishes the importance of the amino-terminal region. This
region may be critical because of association with other proteins required to carry
out cholesterol efflux or related tasks. It may be an important regulatory region
(there is a phosphorylation site for casein kinase near the mutated residue), or it
may help to dictate a precise topological relationship with cellular membranes (the
N-terminal 60 amino acid region contains a putative membrane-spanning or
membrane-associated segment).

The amino-terminal region of the protein (up to the first 6-TM region at
approximately amino acid 639) is an ideal tool for screening factors that affect
ABC1 activity. It can be expressed as a truncated protein in ABC1 wild type cells
in order to test for interference of the normal ABC1 function by the truncated
protein. If the fragment acts in a dominant negative way, it could be used in
immunoprecipitations to identify proteins that it may be competing away from the
normal endogenous protein.

The C-terminus also lends itself to such experiments, as do the intracellular
portions of the molecule, expressed as fragments or tagged or fusion proteins, in
the absence of transmembrane regions.

Since it is possible that there are several genes in the human genome which
affect cholesterol efflux, it is important to establish that any animal model to be
used for a human genetic disease represents the homologous locus in that animal,
and not a different locus with a similar function. The evidence above establishes
that the chicken Y locus and the human chromosome 9 low HDL locus are
homologous. WHAM chickens are therefore an important animal model for the
identification of drugs that modulate cholesterol efflux.

The WHAM chickens' HDL deficiency syndrome is not, however,
associated with an increased susceptibility to atherosclerosis in chickens. This
probably reflects the shorter lifespan of the chicken rather than an inherent
difference in the function of the chicken ABC1 gene compared to the human gene.

We propose the WHAM chicken as a model for human low HDL for the
development and testing of drugs to raise HDL in humans. Such a model could be
employed in several forms, through the use of cells or other derivatives of these
chickens, or by the use of the chickens themselves in tests of drug effectiveness,
toxicity, and other drug development purposes.

Therapy 

Compounds of the invention, including but not limited to, ABC1
polypeptides, ABC1 nucleic acids, other ABC transporters, and any therapeutic
agent that modulates biological activity or expression of ABC1 identified using
any of the methods disclosed herein, may be administered with a
pharmaceutically-acceptable diluent, carrier, or excipient, in unit dosage form.
Conventional pharmaceutical practice may be employed to provide suitable
formulations or compositions to administer such compositions to patients.
Although intravenous administration is preferred, any appropriate route of
administration may be employed, for example, parenteral, subcutaneous,
intramuscular, intracranial, intraorbital, ophthalmic, intraventricular, intracapsular,
intraspinal, intracisternal, intraperitoneal, intranasal, aerosol, or oral
administration. Therapeutic formulations may be in the form of liquid solutions or
suspensions; for oral administration, formulations may be in the form of tablets or
capsules; and for intranasal formulations, in the form of powders, nasal drops, or
aerosols.

Methods well known in the art for making formulations are found in, for
example, Remington: The Science and Practice of Pharmacy (19th ed.), ed. A.R.
Gennaro, 1995, Mack Publishing Company, Easton, PA. Formulations for
parenteral administration may, for example, contain excipients, sterile water, or
saline, polyalkylene glycols such as polyethylene glycol, oils of vegetable origin,
or hydrogenated naphthalenes. Biocompatible, biodegradable lactide polymer,
lactide/glycolide copolymer, or polyoxyethylene-polyoxypropylene copolymers
may be used to control the release of the compounds. Other potentially useful
parenteral delivery systems for agonists of the invention include ethylene-vinyl
acetate copolymer particles, osmotic pumps, implantable infusion systems, and
liposomes. Formulations for inhalation may contain excipients, for example,
lactose, or may be aqueous solutions containing, for example, polyoxyethylene-9-
lauryl ether, glycocholate and deoxycholate, or may be oily solutions for
administration in the form of nasal drops, or as a gel.

Compounds 

In general, novel drugs for the treatment of aberrant cholesterol levels
and/or CVD are identified from large libraries of both natural product and synthetic
(or semi-synthetic) extracts or chemical libraries according to methods known in
the art. Those skilled in the field of drug discovery and development will
understand that the precise source of test extracts or compounds is not critical to
the screening procedure(s) of the invention. Accordingly, virtually any number of
chemical extracts or compounds can be screened using the exemplary methods
described herein. Examples of such extracts or compounds include, but are not
limited to, plant-, fungal-, prokaryotic- or animal-based extracts, fermentation
broths, and synthetic compounds, as well as modification of existing compounds.
Numerous methods are also available for generating random or directed synthesis
(e.g., semi-synthesis or total synthesis) of any number of chemical compounds,
including, but not limited to, saccharide-, lipid-, peptide-, and nucleic acid-based
compounds. Synthetic compound libraries are commercially available from
Brandon Associates (Merrimack, NH) and Aldrich Chemical (Milwaukee, WI).
Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant,
and animal extracts are commercially available from a number of sources,
including Biotics (Sussex, UK), Xenova (Slough, UK), Harbor Branch
Oceanographic Institute (Ft. Pierce, FL), and PharmaMar, U.S.A. (Cambridge,
MA). In addition, natural and synthetically produced libraries are produced, if
desired, according to methods known in the art, e.g., by standard extraction and
fractionation methods. Furthermore, if desired, any library or compound is readily
modified using standard chemical, physical, or biochemical methods.

In addition, those skilled in the art of drug discovery and development
readily understand that methods for dereplication (e.g., taxonomic dereplication,
biological dereplication, and chemical dereplication, or any combination thereof)
or the elimination of replicates or repeats of materials already known for their
HDL-raising and anti-CVD activities should be employed whenever possible.

When a crude extract is found to have cholesterol-modulating or anti-CVD
activities or both, further fractionation of the positive lead extract is necessary to
isolate the chemical constituent responsible for the observed effect. Thus, the goal
of the extraction, fractionation, and purification process is the careful
characterization and identification of a chemical entity within the crude extract
having cholesterol-modulating or anti-CVD activities. The same in vivo and in
vitro assays described herein for the detection of activities in mixtures of
compounds can be used to purify the active component and to test derivatives
thereof. Methods of fractionation and purification of such heterogeneous extracts
are known in the art. If desired, compounds shown to be useful agents for the
treatment of pathogenicity are chemically modified according to methods known in
the art. Compounds identified as being of therapeutic value are subsequently
analyzed using any standard animal model of diabetes or obesity known in the art.

It is understood that compounds that modulate activity of proteins that
modulate or are modulated by ABC1 are useful compounds for modulating
cholesterol levels. Exemplary compounds are provided herein; others are known
in the art.

Compounds that are structurally related to cholesterol, or that mimic ApoAI
or a related apolipoprotein, and increase ABC1 biological activity are particularly
useful compounds in the invention. Other compounds, known to act on the MDR
protein, can also be used or derivatized and assayed for their ability to increase
ABC1 biological activity. Exemplary MDR modulators are PSC833,
bromocriptine, and cyclosporin A.

Screening patients having low HDL-C 

ABC1 expression, biological activity, and mutational analysis can each
serve as a diagnostic tool for low HDL; thus, genetic subtyping of the ABC1
gene sequence can be used to subtype low HDL individuals
or families to determine whether the low HDL phenotype is related to ABC1
function. This diagnostic process can lead to the tailoring of drug treatments
according to patient genotype, including prediction of side effects upon
administration of HDL-increasing drugs (referred to herein as pharmacogenomics).
Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic
or prophylactic treatment of an individual based on the genotype of the individual
(e.g., the genotype of the individual is examined to determine the ability of the
individual to respond to a particular agent).

Agents or modulators that have a stimulatory or inhibitory effect on
ABC1 biological activity or gene expression can be administered to individuals to
treat disorders (e.g., cardiovascular disease or low HDL cholesterol) associated
with aberrant ABC1 activity. In conjunction with such treatment, the
pharmacogenomics (i.e., the study of the relationship between an individual's
genotype and that individual's response to a foreign compound or drug) of the
individual may be considered. Differences in efficacy of therapeutics can lead to
severe toxicity or therapeutic failure by altering the relation between dose and
blood concentration of the pharmacologically active drug. Thus, the
pharmacogenomics of the individual permits the selection of effective agents (e.g.,
drugs) for prophylactic or therapeutic treatments based on a consideration of the
individual's genotype. Such pharmacogenomics can further be used to determine
appropriate dosages and therapeutic regimens. Accordingly, the activity of ABC1
protein, expression of ABC1 nucleic acid, or mutation content of ABC1 genes in
an individual can be determined to thereby select appropriate agent(s) for
therapeutic or prophylactic treatment of the individual.

Pharmacogenomics deals with clinically significant hereditary variations in
the response to drugs due to altered drug disposition and abnormal action in
affected persons (Eichelbaum, M., Clin. Exp. Pharmacol. Physiol., 23:983-985,
1996; Linder, M. W., Clin. Chem., 43:254-266, 1997). In general, two types of
pharmacogenetic conditions can be differentiated: genetic conditions transmitted
as a single factor altering the way drugs act on the body (altered drug action), and
genetic conditions transmitted as single factors altering the way the body acts on
drugs (altered drug metabolism). Altered drug action may occur in a patient
having a polymorphism (e.g., a single nucleotide polymorphism or SNP) in
promoter, intronic, or exonic sequences of ABC1. Thus, determining the
presence and prevalence of polymorphisms allows for prediction of a patient's
response to a particular therapeutic agent. In particular, polymorphisms in the
promoter region may be critical in determining the risk of HDL deficiency and
CVD.

In addition to the mutations in the ABC1 gene described herein, we have
detected polymorphisms in the human ABC1 gene (Fig. 11). These polymorphisms
are located in promoter, intronic, and exonic sequence of ABC1. Using standard
methods, such as direct sequencing, PCR, SSCP, or any other polymorphism-
detection system, one could easily ascertain whether these polymorphisms are
present in a patient prior to the establishment of a drug treatment regimen for a
patient having low HDL, cardiovascular disease, or any other ABC1-mediated
condition. It is possible that some of these polymorphisms are, in fact, weak
mutations. Individuals harboring such mutations may have an increased risk for
cardiovascular disease; thus, these polymorphisms may also be useful in diagnostic
assays.

Association Studies of ABC1 Gene Variants and HDL Levels or Cardiovascular
Disease

The following polymorphisms have been examined for their effect on 
cholesterol regulation and the predisposition for the development of cardiovascular 
disease. 

Substitution of G for A at nucleotide -1045 [G(-1045)A]. This variant is in
complete linkage disequilibrium with the variant at -738 in the individuals we have
sequenced, and thus any potential phenotypic effects currently attributed to the
variant at -738 may at least in part be due to changes at this site.

Substitution of G for A at nucleotide -738 [G(-738)A]. This variant has
been found at very high frequencies in populations selected for low HDL
cholesterol or premature coronary artery disease.

Insertion of a G nucleotide at position -4 [G ins (-4)]. This variant has been
associated with less coronary artery disease in its carriers than in non-carriers.

Substitution of a C for G at nucleotide -57 [G(-57)C]. This variant is in
complete linkage disequilibrium with the variant at -4 in the individuals we have
sequenced, and thus the phenotypic effects currently attributed to the variant at -4
may at least in part be due to changes at this site.

Substitution of A for G at nucleotide 730 (R219K). We have found carriers
to have significantly less cardiovascular disease.

Substitution of C for T at nucleotide 1270 (V399A). Within the French
Canadian population, this variant has only been found in individuals from the low
HDL population. It has also been seen in individuals of Dutch ancestry with low
HDL or premature coronary artery disease.

Substitution of A for G at nucleotide 2385 (V771M). This variant has been
found at an increased frequency in a Dutch population selected for low HDL and at
an increased frequency in a population selected for premature coronary artery
disease compared to a control Dutch population, indicating carriers of this variant
may have reduced HDL and an increased susceptibility to coronary artery disease.

Substitution of C for A at nucleotide 2394 (T774P). This variant has been
seen at lower frequencies in populations with coronary artery disease or low HDL
than in individuals without.

Substitution of C for G at nucleotide 2402 (K776N). This variant has been
found at a significantly lower frequency (0.56% vs. 2.91%, p=0.02) in a coronary
artery disease population vs. a control population of similar Dutch background.

Substitution of C for G at nucleotide 3590 (E1172D). This variant is seen at
lower frequencies in individuals with low HDL and in some populations with
premature coronary artery disease.

Substitution of A for G at nucleotide 4384 (R1587K). This variant has been
found at decreased frequencies in the 1/3 of individuals with the highest HDL
levels in our large Dutch coronary artery disease population (p=0.036), at
increased frequencies in those with HDL cholesterol <0.9 mmol/L (p<0.0001) and
at decreased frequencies in the cohorts with HDL cholesterol >1.4 mmol/L in both
this population (p=0.02) and the Dutch control population (p=0.003).

Substitution of G for C at nucleotide 5266 (S1731C). Two FHA individuals
who have this variant on the other allele have much lower HDL cholesterol
(0.155±0.025) than the FHA individuals in the family who do not have this variant
on the other allele (0.64±0.14, p=0.0009). This variant has also been found in one
general population French Canadian control with HDL at the 8th percentile (0.92)
and one French Canadian individual from a population selected for low HDL and
coronary disease (0.72).

Substitution of G for A at nucleotide -1113 [A(-1113)G]. This variant has
been seen at varying frequencies in populations distinguished by their HDL levels.

Additional polymorphisms that may be associated with altered risk for 
cardiovascular disease or altered cholesterol levels are as follows: 

Substitution of G for A at nucleotide 2723 (I883M). This variant has been
seen at a much higher frequency in individuals of Dutch ancestry with premature
coronary artery disease.

Insertion of 4 nucleotides (CCCT) at position -1181. 

Substitution of C for A at nucleotide -479 (linkage disequilibrium with
-518).

Substitution of G for A at nucleotide -380. 
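
Frequency comparisons of the kind reported for the variants above (for example,
a variant seen less often in a coronary artery disease population than in
controls) are commonly evaluated with an exact test on a 2x2 table of counts.
The sketch below is one way to do this, assuming a two-sided Fisher exact test;
this specification does not state which test produced the reported p-values,
and the counts in the usage lines are invented placeholders, not study data.

```python
from math import comb


def variant_frequency(carriers, total):
    """Fraction of sampled alleles (or individuals) carrying the variant."""
    if total <= 0 or not 0 <= carriers <= total:
        raise ValueError("need 0 <= carriers <= total and total > 0")
    return carriers / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Rows are the two populations; columns are variant present / absent.
    Sums the probabilities of all tables with the same margins that are no
    more likely than the observed one.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(low, high + 1))
                if p <= observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Placeholder counts: (carriers, non-carriers) in cases and in controls.
    cases, controls = (2, 398), (10, 390)
    print(variant_frequency(cases[0], sum(cases)))
    print(variant_frequency(controls[0], sum(controls)))
    print(fisher_exact_two_sided(*cases, *controls))
```

An exact test is used here because carrier counts for rare variants are small,
where large-sample approximations can be unreliable.
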

Other Embodiments 

All publications mentioned in this specification are herein incorporated by
reference to the same extent as if each independent publication was specifically
and individually indicated to be incorporated by reference.

While the invention has been described in connection with specific
embodiments thereof, it will be understood that it is capable of further
modifications. This application is intended to cover any variations, uses, or
adaptations following, in general, the principles of the invention and including
such departures from the present disclosure that come within known or customary
practice within the art to which the invention pertains and may be applied to the
essential features hereinbefore set forth.

What we claim is:

1. A substantially pure ABC1 polypeptide having ABC1 biological
activity.

2. The substantially pure ABC1 polypeptide of claim 1, wherein said
ABC1 polypeptide is human ABC1.

3. The substantially pure ABC1 polypeptide of claim 1, wherein said
polypeptide comprises amino acids 1 to 60 of SEQ ID NO: 1.

4. The substantially pure ABC1 polypeptide of claim 1, wherein said
polypeptide comprises amino acids 61 to 2261 of SEQ ID NO: 1.

5. The substantially pure ABC1 polypeptide of claim 1, wherein said
polypeptide comprises amino acids 1 to 2261 of SEQ ID NO: 1.

6. A substantially pure ABC1 polypeptide comprising amino acids 1 to 60
of SEQ ID NO: 1.

7. A substantially pure ABC1 polypeptide comprising amino acids 61 to
2261 of SEQ ID NO: 1.

8. A substantially pure ABC1 polypeptide comprising amino acids 1 to
2261 of SEQ ID NO: 1.

9. A substantially pure nucleic acid molecule that hybridizes at high
stringency conditions to nucleotides 75 to 254 of SEQ ID NO: 2 and encodes a
polypeptide having ABC1 biological activity.

10. A substantially pure nucleic acid molecule encoding an ABC1
polypeptide having ABC1 biological activity.

11. The substantially pure nucleic acid molecule of claim 9 or 10, wherein
said nucleic acid molecule comprises nucleotides 75 to 254 of SEQ ID NO: 2.

12. The substantially pure nucleic acid molecule of claim 9 or 10, wherein
said nucleic acid molecule comprises nucleotides 255 to 6857 of SEQ ID NO: 2.

13. The substantially pure nucleic acid molecule of claim 9 or 10, wherein
said nucleic acid molecule comprises nucleotides 75 to 6857 of SEQ ID NO: 2.

14. An expression vector comprising the nucleic acid molecule of claim 9.

15. A cell expressing the nucleic acid molecule of claim 9.

16. A non-human mammal expressing the nucleic acid molecule of claim 9.

17. A substantially pure nucleic acid molecule comprising nucleotides 75
to 254 of SEQ ID NO: 2.

18. A substantially pure nucleic acid molecule comprising nucleotides 255
to 6857 of SEQ ID NO: 2.

19. A substantially pure nucleic acid molecule comprising nucleotides 75
to 6857 of SEQ ID NO: 2.

20. A substantially pure nucleic acid molecule comprising at least thirty
consecutive nucleotides corresponding to nucleotides 7015-7860 of SEQ ID NO:
2.

21. The substantially pure nucleic acid molecule of claim 20, wherein said
nucleic acid molecule comprises nucleotides 7015-7860 of SEQ ID NO: 2.

22. A substantially pure nucleic acid molecule that hybridizes at high
stringency to a probe comprising nucleotides 7015-7860 of SEQ ID NO: 2.

23. A method of treating a human having low HDL cholesterol or
cardiovascular disease, said method comprising administering to said human an
ABC1 polypeptide, or cholesterol-regulating fragment thereof.

24. The method of claim 23, wherein said ABC1 polypeptide has the
sequence of SEQ ID NO: 1.

25. The method of claim 23, wherein said ABC1 polypeptide comprises a
mutation that increases its stability.

26. The method of claim 23, wherein said ABC1 polypeptide comprises a
mutation that increases its biological activity.

27. A method of treating a human having low HDL cholesterol or
cardiovascular disease, said method comprising administering to said human a
nucleic acid molecule encoding an ABC1 polypeptide or a cholesterol-regulating
fragment thereof.

28. The method of claim 27, wherein said ABC1 polypeptide has the amino
acid sequence of SEQ ID NO: 1.

29. The method of claim 27, wherein said ABC1 polypeptide comprises a
mutation that increases its stability.

30. The method of claim 27, wherein said ABC1 polypeptide comprises a
mutation that increases its biological activity.

31. The method of claim 30, wherein said biological activity is regulation
of cholesterol.

32. The method of claim 27, wherein said human has low HDL cholesterol
levels relative to normal.

33. A method of increasing ABC1 biological activity in a human, said
method comprising administering to said human a nucleic acid molecule that
hybridizes at high stringency conditions to nucleotides 75 to 254 of SEQ ID NO: 2
and encodes a polypeptide having ABC1 biological activity.

34. The method of claim 33, wherein said human has a disease selected
from the group consisting of Alzheimer's disease, Niemann-Pick disease,
Huntington's disease, x-linked adrenoleukodystrophy, and cancer.



35. A method of increasing ABC1 biological activity in a human, said
method comprising administering to said human a compound that increases ABC1
biological activity.

36. The method of claim 35, wherein said human has a disease selected
from the group consisting of Alzheimer's disease, Niemann-Pick disease,
Huntington's disease, X-linked adrenoleukodystrophy, and cancer.

37. A method of preventing cardiovascular disease in a human, said
method comprising administering to said human an expression vector comprising
an ABC1 nucleic acid molecule operably linked to a promoter, said ABC1 nucleic
acid molecule encoding an ABC1 polypeptide having ABC1 biological activity.

38. A method of preventing or ameliorating the effects of a disease-causing
mutation in an ABC1 gene in a human, said method comprising
introducing into said human an expression vector comprising a promoter operably
linked to an ABC1 nucleic acid molecule encoding an ABC1 polypeptide having
ABC1 biological activity.

39. A method of treating or preventing cardiovascular disease in an
animal, said method comprising administering to said animal a compound that
mimics the activity of wild-type ABC1.

40. The method of claim 39, wherein said animal is a human. 

41. A method of treating or preventing cardiovascular disease in an
animal, said method comprising administering to said animal a compound that
modulates the biological activity of ABC1.

42. The method of claim 41, wherein said animal is a human.

43. The method of claim 41, wherein said compound is selected from a
group consisting of protein kinase A, protein kinase C, vanadate, okadaic acid,
IBMX, fibrates, β-estradiol, arachidonic acid derivatives, WY-14,643, LTB4,
8(S)HETE, thiazolidinedione antidiabetic drugs, 9-HODE, 13-HODE, nicotinic
acid, HMG CoA reductase inhibitors, and compounds that increase PPAR-mediated
ABC1 expression.

44. The method of claim 23, 27, 39, or 41, wherein said cardiovascular 
disease is coronary artery disease, cerebrovascular disease, coronary restenosis, or 
peripheral vascular disease. 

45. A method for determining whether a candidate compound is useful for
modulating cholesterol levels, said method comprising the steps of:

(a) providing a chicken comprising a mutation in an ABC1 gene;

(b) administering said candidate compound to said chicken; and

(c) measuring ABC1 biological activity in said chicken,

wherein altered ABC1 biological activity, relative to a WHAM chicken not
contacted with said compound, indicates that said candidate compound modulates
cholesterol levels.

46. The method of claim 45, wherein said ABC1 biological activity is
transport of cholesterol.

47. A method for determining whether a candidate compound modulates
ABC1 biological activity, said method comprising the steps of:

(a) providing a cell expressing an ABC1 polypeptide comprising amino
acids 1 to 60 of SEQ ID NO: 1;

(b) contacting said cell with said candidate compound; and

(c) measuring ABC1 biological activity of said cell,

wherein altered ABC1 biological activity, relative to a cell not contacted with said
compound, indicates that said candidate compound modulates ABC1 biological
activity.

48. A method for determining whether a candidate compound modulates
ABC1 expression, said method comprising the steps of:

(a) providing a cell expressing an ABC1 gene comprising nucleotides 75 to
254 of SEQ ID NO: 2;

(b) contacting said cell with said candidate compound; and

(c) measuring ABC1 expression of said cell,

wherein altered ABC1 expression, relative to a cell not contacted with said
compound, indicates that said candidate compound modulates ABC1 expression.

49. A method for determining whether a candidate compound modulates
ABC1 expression, said method comprising the steps of:

(a) providing a nucleic acid molecule comprising an ABC1 promoter
operably linked to a reporter gene;

(b) contacting said nucleic acid molecule with said candidate compound;
and

(c) measuring expression of said reporter gene,

wherein altered reporter gene expression, relative to a control not contacted with
said compound, indicates that said candidate compound modulates ABC1
expression.

50. The method of claim 49, wherein said promoter comprises 50
consecutive nucleotides selected from nucleotides 1 to 8238 of SEQ ID NO: 14.

51. The method of claim 50, wherein said promoter comprises a binding
site for a transcription factor selected from a group consisting of steroid response
element binding proteins, peroxisomal proliferation-activated receptors, retinoid X
receptors, and RAR-related orphan receptors.

52. A method for determining whether a candidate compound modulates
ABC1 biological activity, said method comprising the steps of:

(a) providing an ABC1 polypeptide comprising amino acids 1 to 60 of SEQ
ID NO: 1;

(b) contacting said polypeptide with said candidate compound; and

(c) measuring ABC1 biological activity,

wherein a change in ABC1 biological activity, relative to a control not contacted
with said compound, indicates that said candidate compound modulates ABC1
biological activity.

53. A method for determining whether a candidate compound modulates
ABC1 expression, said method comprising the steps of:

(a) providing an ABC1 polypeptide comprising amino acids 1 to 60 of SEQ
ID NO: 1;

(b) contacting said polypeptide with said candidate compound; and

(c) measuring expression of said ABC1 polypeptide,

wherein a change in expression of said ABC1 polypeptide, relative to a control not
contacted with said compound, indicates that said candidate compound modulates
ABC1 expression.

54. A method for determining whether a candidate compound modulates
ABC1 biological activity, said method comprising the steps of:

(a) providing an ABC1 polypeptide comprising amino acids 1 to 60 of SEQ
ID NO: 1;

(b) contacting said polypeptide with said candidate compound; and

(c) measuring binding of said ABC1 polypeptide to said candidate
compound, wherein binding of said ABC1 polypeptide to said compound indicates
that said candidate compound modulates ABC1 biological activity.

55. A method for determining whether a candidate compound modulates
ABC1 biological activity, said method comprising the steps of:

(a) providing (i) an ABC1 polypeptide comprising amino acids 1 to 60 of
SEQ ID NO: 1, and (ii) a second polypeptide that interacts with said ABC1
polypeptide;

(b) contacting said polypeptides with said candidate compound; and

(c) measuring interaction of said ABC1 polypeptide with said second
polypeptide, wherein an alteration in the interaction of said ABC1 polypeptide
with said second polypeptide indicates that said candidate compound modulates
ABC1 biological activity.

56. A method for determining whether a candidate compound increases the
stability or decreases the regulated catabolism of an ABC1 polypeptide, said
method comprising the steps of:

(a) providing a cell comprising an ABC1 polypeptide comprising amino
acids 1 to 60 of SEQ ID NO: 1;

(b) contacting said cell with said candidate compound; and

(c) measuring the half-life of said ABC1 polypeptide,

wherein an increase in said half-life, relative to a control not contacted with said
compound, indicates that said candidate compound increases the stability or
decreases the regulated catabolism of an ABC1 polypeptide.

57. A method for determining whether a candidate compound modulates
ABC1 biological activity, said method comprising the steps of:

(a) providing an ABC1 polypeptide in a lipid membrane;

(b) contacting said polypeptide with said candidate compound; and

(c) measuring ABC1-mediated lipid transport across said lipid membrane,

wherein a change in lipid transport, relative to a control not contacted with said
compound, indicates that said candidate compound modulates ABC1 biological
activity.

58. The method of claim 49, 52, 53, 54, 55, or 57, wherein said ABC1
polypeptide is in a cell-free system.

59. The method of claim 49, 52, 53, 54, 55, or 57, wherein said ABC1
polypeptide is in a cell.

60. The method of claim 59, wherein said cell is from a WHAM chicken. 

61. The method of claim 59, wherein said cell is in a human or in a non-human
mammal.

62. The method of claim 61, wherein said animal is a WHAM chicken. 

63. The method of claim 52, wherein said biological activity is transport of 
lipid or interleukin-1. 

64. The method of claim 62, wherein said lipid is cholesterol. 

65. The method of claim 64, wherein said cholesterol is HDL-cholesterol.

66. The method of claim 52, wherein said biological activity is binding or
hydrolysis of ATP by the ABC1 polypeptide.

67. A method for determining whether a patient has an increased risk for
cardiovascular disease, said method comprising determining whether an ABC1
gene of said patient has a mutation, wherein a mutation indicates that said patient
has an increased risk for cardiovascular disease.

68. A method for determining whether a patient has an increased risk for
cardiovascular disease, said method comprising measuring ABC1 biological
activity in said patient or in a cell from said patient, wherein increased or
decreased levels in said ABC1 biological activity, relative to normal levels,
indicates that said patient has an increased risk for cardiovascular disease.

69. A method for determining whether a patient has an increased risk for
cardiovascular disease, said method comprising measuring ABC1 expression in
said patient or in a cell from said patient, wherein decreased levels in said ABC1
expression relative to normal levels, indicates that said patient has an increased
risk for cardiovascular disease.

70. The method of claim 69, wherein said ABC1 expression is determined
by measuring levels of ABC1 polypeptide.

71. The method of claim 69, wherein said ABC1 expression is determined
by measuring levels of ABC1 RNA.

72. A non-human mammal comprising a transgene comprising a nucleic
acid molecule encoding a dominant-negative ABC1 polypeptide.

73. A cell isolated from a non-human mammal comprising a transgene
comprising a nucleic acid molecule encoding an ABC1 polypeptide having
biological activity.

74. A method for determining whether a candidate compound decreases the
inhibition of a dominant-negative ABC1 polypeptide, said method comprising the
steps of:

(a) providing a cell expressing a dominant-negative ABC1 polypeptide;

(b) contacting said cell with said candidate compound; and

(c) measuring ABC1 biological activity of said cell,

wherein an increase in said ABC1 biological activity, relative to a cell not
contacted with said compound, indicates that said candidate compound decreases
the inhibition of a dominant-negative ABC1 polypeptide.

75. A method for determining whether a person has an altered risk for
developing cardiovascular disease, comprising examining the person's ABC1 gene
for polymorphisms, wherein the presence of a polymorphism associated with
cardiovascular disease indicates the person has an altered risk for developing
cardiovascular disease.

76. A method for predicting a person's response to a drug, comprising
determining whether the person has a polymorphism in an ABC1 gene that alters
the person's response to said drug.

77. A method for predicting a person's response to a drug, comprising
determining whether the person has a polymorphism in an ABC1 promoter that
alters the person's response to said drug.

78. A method for altering ABC1 expression in a cell, said method
comprising contacting said cell with a compound selected from a group consisting
of fibrates, β-estradiol, arachidonic acid derivatives, WY-14,643, LTB4,
8(S)HETE, thiazolidinedione antidiabetic drugs, 9-HODE, 13-HODE, nicotinic
acid, HMG CoA reductase inhibitors, and compounds that increase PPAR-mediated
ABC1 expression.

79. A pharmaceutical composition comprising (i) a nucleic acid molecule
that hybridizes under high stringency conditions to nucleotides 75 to 254 of SEQ
ID NO: 2 and encodes a polypeptide having ABC1 biological activity; and (ii) a
pharmaceutically acceptable carrier.

80. A nucleic acid that hybridizes under high stringency conditions to
nucleotides 1 to 8236 of SEQ ID NO: 14.

81. A nucleic acid comprising a region that is 80% identical to at least
thirty contiguous nucleotides of nucleotides 1 to 8236 of SEQ ID NO: 14.

82. A method for determining whether a candidate compound modulates
ABC1 biological activity, said method comprising the steps of:

(a) providing an ABC1 polypeptide;

(b) contacting said polypeptide with cholesterol and said candidate
compound; and

(c) measuring binding of said cholesterol to said ABC1 polypeptide,

wherein binding of said cholesterol to said ABC1 polypeptide indicates that said
candidate compound modulates ABC1 biological activity.

83. The method of claim 82, wherein said cholesterol is HDL cholesterol. 

84. The method of claim 82, wherein said method is performed in a cell 
free assay. 

85. The method of claim 82, wherein said ABC1 polypeptide comprises
amino acids 1 to 60 of SEQ ID NO: 1.

86. The method of claim 82, wherein said cholesterol or said ABC1
polypeptide is detectably labeled.

METHODS AND REAGENTS FOR MODULATING
CHOLESTEROL LEVELS
Abstract of the Disclosure
The invention features ABC1 nucleic acids and polypeptides for the
diagnosis and treatment of abnormal cholesterol regulation. The invention also
features methods for identifying compounds for modulating cholesterol levels in
an animal (e.g., a human).
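
As an illustrative aside, the claims recite both nucleotides 75 to 254 of SEQ ID NO: 2 and amino acids 1 to 60 of SEQ ID NO: 1. The following Python sketch shows one way to check that the two ranges are consistent, assuming 1-based inclusive numbering and the standard genetic code. The helper names (subsequence, translate) and the constant SEQ2_FIRST_120 are hypothetical and introduced only for this example; the 120 nucleotides are the first two lines of SEQ ID NO: 2 as printed in this document.

```python
# Illustrative sketch only; helper names are hypothetical.
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def subsequence(seq, start, end):
    """Return nucleotides start..end, using 1-based inclusive numbering."""
    return seq[start - 1:end]


def translate(nucleotides):
    """Translate complete codons from the first base; ignore a trailing partial codon."""
    nucleotides = nucleotides.upper()
    usable = len(nucleotides) - len(nucleotides) % 3
    return "".join(CODON_TABLE[nucleotides[i:i + 3]] for i in range(0, usable, 3))


# First 120 nucleotides of SEQ ID NO: 2 as printed in this document.
SEQ2_FIRST_120 = (
    "GTCCCTGCTGTGAGCTCTGGCCGCTGCCTTCCAGGGCTCCCGAGCCACACGCTGGGGGTG"
    "CTGGCTGAGGGAACATGGCTTGTTGGCCTCAGCTGAGGTTGCTGCTGTGGAAGAACCTCA"
)

# Nucleotides 75 to 254 span 180 bases, i.e. 60 codons.
CODONS_75_TO_254 = (254 - 75 + 1) // 3
print(CODONS_75_TO_254)

# Nucleotides 75 to 119 cover the first 15 codons available in the excerpt.
print(translate(subsequence(SEQ2_FIRST_120, 75, 119)))
```

Under these assumptions the excerpt translates to MACWPQLRLLLWKNL, which agrees with the start of SEQ ID NO: 1, and the 180-nucleotide span corresponds to 60 codons.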

Fig. 4B. Exon 30, TD-1. Nucleotides 4485 to 4529, with position 4503 marked.
Row labels: wt sequence, HUMAN_ABC1, MOUSE_ABC1, Patient, CAEEL_ABC, Patient.

wt sequence: aagaagatgctgcctgtgTgtcccccaggggcaggggggctgcct
Patient:     aagaagatgctgcctgtgCgtcccccaggggcaggggggctgcct



Fig. 5B. Exon 13, TD-2. Nucleotides 1842 to 1886, with position 1864 marked.
Row labels: wt sequence, HUMAN_ABC1, MOUSE_ABC1, Patient, CAEEL_ABC, Patient.

wt sequence: tgggggggcttcgcctacttgcAggatgtggtggagcaggcaatc
Patient:     tgggggggcttcgcctacttgcGggatgtggtggagcaggcaatc
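
The comparison shown in Fig. 5B (exon 13, TD-2, nucleotides 1842 to 1886) can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the figure: it assumes the wt and patient sequences are the same length and already aligned, and the function name find_substitutions is hypothetical.

```python
# Illustrative sketch only; the function name is hypothetical.
def find_substitutions(reference, sample, first_position):
    """List (position, reference base, sample base) for each mismatch.

    Both sequences must be the same length and already aligned;
    first_position is the cDNA number of the first base shown.
    Case is ignored, because the figure capitalizes the variant base.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (first_position + offset, ref.upper(), alt.upper())
        for offset, (ref, alt) in enumerate(zip(reference, sample))
        if ref.upper() != alt.upper()
    ]


# Sequences as printed for exon 13, TD-2 (nucleotides 1842 to 1886).
WT = "tgggggggcttcgcctacttgcAggatgtggtggagcaggcaatc"
PATIENT = "tgggggggcttcgcctacttgcGggatgtggtggagcaggcaatc"

print(find_substitutions(WT, PATIENT, 1842))  # [(1864, 'A', 'G')]
```

The single reported difference falls at position 1864, the middle position marked in the figure. Deletions such as those in Figs. 6B and 6E would need a gap-aware alignment, which this sketch does not attempt.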



Fig. 6B. Exon 14, FHA-1. Nucleotides 2136 to 2180, with position 2151 marked.
Row labels: wt sequence, HUMAN_ABC1, MOUSE_ABC1, Patient, CAEEL_ABC.

wt sequence: agtagcctcattcctCTTcttgtgagcgctggcctgctagtggtc
Patient:     agtagcctcattcct---cttgtgagcgctggcctgctagtggtc
3 bp deletion



Fig. 6E. Exon 41, FHA-3. Positions 5740, 5752, and 5775 marked; Δ5752-5757.
Row labels: wt sequence, HUMAN_ABC1, Patient (ΔE1893, D1894).

wt sequence: gaagatGAAGATgtgaggcgggaaagacag
Patient:     gaagat------gtgaggcgggaaagacag
6 bp deletion

SEQ ID NO: 1



MACWPQLRLLLVvKNLTFRRRQTCQLLLEVAWPLFIFLILISWLSYPPYEQHECHFPNKAMPSAGTLPWVQ 

GIICNAM\rPCFRyPTPGEAPGWGNFNKSIVARLFSDARRLLLYSQKDTSMKDMRK\'LRTLQQIKKSSSNL 

KLODFLVDNETFSGFLYHNLSLPKSTVDKMLRADVILHKVFLQGYQLHLTSLCNGSKSEEMIQLGDQEVSE 

LCGLPREKLAAAERVLRSNMDILKPILRTLNSTSPFPSKELAEATKTLLHSLGTLAQELFSMRSWSDMRQE 

VMFLTNVNSSSSSTQIYQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKALFGGNGTEEDAETFYDNSTTPYC 

NDLMKNLE S S PI. SRI I WKALKPLLVGKI L YTPDTPATRQ\mAE WKTFQELAVFHDLSGMWEELS PKI WTF 

MENSQEMDLVR>ILLDSRDNDHFWEQQLDGLDWTAQDIVAFLAKHPEDVQSSNGSVYrvN^REAFNETNQAIRT 

ISRFMECWLKKLEPIATEWLINKSMELLDERKFWAGIVFTGITPGSIELPHHVKYKIRiyD 

IKDGYWDPGPRADPFEDMRYWGGFAYLQDVVEQAIIRVLTGTEKKTGVYMQQMPYPCYVDDIFLRVMSRS 

MPLFMTLAWIY3VAVIIKGIWEKEARLKETMRIMGLDNSILWFSWFISSLIPLLV3AGLLVVILKLGNLL 

PYSDPSWFVFLSVFAWTILQCFLISTLFSRANLAAACGGIIYFTLYLPYVLCVAWQDYVGFTLKIFASL 

LSPVAFGFGCEYFALFEEQGIGVQWDNLFESPVEEDGFNLTTSVSMMLFDTFLYGVNTWYIEAVFPGQYGI 

PRPWYFPCTKSYWFGEESDEKSHPGSNQKRISEICMEEEPTHLKLGVSIQNLVKAmiDGMK^AVDGLAL^ 

YEGQITSFLGKNGAGKTTTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLGVCPQHNVLFDMLTVEEHI 

WFYARLKGLSEKHVKAEMEQMALDVGLPSSK1;KSKTSQLSGGMQRKLSVALAFVGGSKWILDEPTAGV^ 

YSRRGIWELLLKYRQGRTIILSTHHMDEADVLGDRIAIISHGKLCCVGSSLFLKNQLGTGYYLTLVKKDVE 

SSLSSCRNSSSTVSYLKKEDSVSQSSSDAGLGSDHESDTLTIDVSAISNLIRKHVSEARLVEDIGHELTYV 

LPYEAAKEGAF\?*ELFHEIDDRLSDLGISSYGISETTLEEIFLKVAEESGVDAETSDGTLPARRNRRAFGDK 

QSCLRPFTEDDAADPNDSDIDPESRETDLLSGMDGKGSYQVKGWKLTQQQFVALLWKRLLIARRSRKGFFA 

QIVLPAVFVCIALVFSLIVPPFGKYPSLELQPWMYNEQYTFVSNDAPEDTGTLELLNALTKDPGFGTRCME 

GNPIPDTPCQAGEEEWTTAPVPQTIMDLFQNGNWTMQNPSPACQCSSDKIKKMLPVCPPGAGGLPPPQRKQ 

NTADILQDLTGRNISDYLVKTYVQIIAKSLKNKIWWEFRYGGFSLGVSNTQALPPSQEVlSrDAIKQMKK^ 

KLAKDSSADRFLNSLGRFMTGLDTRNmTKVWFIOTKGWHAISSFLNVINNAILR^ 

HPLNLTKQQLSEVALMTTSVDVLVSICVIFAMSFVPASFWFLIQERVSKAKHLQFISGVKPVIYWLSNFV 

WDMCNYWPATLVIIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLMYPASFVFKIPSTAYAATLTSVNL 

FIGINGSVATFVLELFTDNKLNNIiroiLKSVFLIFPHFCLGRGLIDMVKNQAMADALERFGENRFVSPLSW 

DLVGRJTLFAMAVEGWFFLITATLIQYRFFIRPRPVNAKLSPLNDEDEDVRRERQRILDGGGQNDILEIKEL 

TKIYRRKRKPAVDRICVGIPPGECFGLLGVNGAGKSSTFKMLTGDTTVTRGDAFLNKNSILSNIHEVHQNM 

GYCPQFDAITELLTGREHVEFFALLRGVPEKEVGKVGEWAIRKLGLVKYGEKYAGNYSGGNKRKLSTAMAL 

IGGPPWFLDEPTTGMDPKARRFLWNCALSV\TCEGRSWLTSHSMEECEALCTRMAIM^ 

LKNRFGDGYTIWRIAGSNPDLKPVQDFFGIAFPGSVLKEKHRNMLQYQLPSSLSSIJ^RIFSILSQSKKRL 

HIEDYSVSQTTLDQVFVNFAKDQSDDDHLKDLSLHKNQTVVDVAVLTSFLQDEKVKESYV*

SEQ ID NO: 2

GTCCCTGCTGTGAGCTCTGGCCGCTGCCTTCCAGGGCTCCCGAGCCACACGCTGGGGGTG 

CTGGCTGAGGGAACATGGCTTGTTGGCCTCAGCTGAGGTTGCTGCTGTGGAAGAACCTCA 

CTTTCAGAAGAAGACAAACATGTCAGCTGTTACTGGAAGTGGCCTGGCCTCTATTTATCT 

TCCTGATCCTGATCTCTGTTCGGCTGAGCTACCCACCCTATGAACAACATGAATGCCATT 

TTCCAAATAAAGCCATGCCCTCTGCAGGAACACTTCCTTGGGTTCAGGGGATTATCTGTA 

ATGCCAACAACCCCTGTTTCCGTTACCCGACTCCTGGGGAGGCTCCCGGAGTTGTTGGAA 

ACTTTAACAAATCCATTGTGGCTCGCCTGTTCTCAGATGCTCGGAGGCTTCTTTTATACA 

GCCAGAAAGACACCAGCATGAAGGACATGCGCAAAGTTCTGAGAACATTACAGCAGATCA 

AGAAATCCAGCTCAAACTTGAAGCTTCAAGATTTCCTGGTGGACAATGAAACCTTCTCTG 

GGTTCCTGTATCACAACCTCTCTCTCCCAAAGTCTACTGTGGACAAGATGCTGAGGGCTG 

ATGTCATTCTCCACAAGGTATTTTTGCAAGGCTACCAGTTACATTTGACAAGTCTGTGCA 

ATGGATCAAAATCAGAAGAGATGATTCAACTTGGTGACCAAGAAGTTTCTGAGCTTTGTG 

GCCTACCAAGGGAGAAACTGGCTGCAGCAGAGCGAGTACTTCGTTCCAACATGGACATCC 

TGAAGCCAATCCTGAGAACACTAAACTCTACATCTCCCTTCCCGAGCAAGGAGCTGGCTG 

AAGCCACAAAAACATTGCTGCATAGTCTTGGGACTCTGGCCCAGGAGCTGTTCAGCATGA 

GAAGCTGGAGTGACATGCGACAGGAGGTGATGTTTCTGACCAATGTGAACAGCTCCAGCT 

CCTCCACCCAAATCTACCAGGCTGTGTCTCGTATTGTCTGCGGGCATCCCGAGGGAGGGG 

GGCTGAAGATCAAGTCTCTCAACTGGTATGAGGACAACAACTACAAAGCCCTCTTTGGAG 

GCAATGGCACTGAGGAAGATGCTGAAACCTTCTATGACAACTCTACAACTCCTTACTGCA

ATGATTTGATGAAGAATTTGGAGTCTAGTCCTCTTTCCCGCATTATCTGGAAAGCTCTGA 

AGCCGCTGCTCGTTGGGAAGATCCTGTATACACCTGACACTCCAGCCACAAGGCAGGTCA 

TGGCTGAGGTGAACAAGACCTTCCAGGAACTGGCTGTGTTCCATGATCTGGAAGGCATGT 

GGGAGGAACTCAGCCCCAAGATCTGGACCTTCATGGAGAACAGCCAAGAAATGGACCTTG 

TCCGGATGCTGTTGGACAGCAGGGACAATGACCACTTTTGGGAACAGCAGTTGGATGGCT 

TAGATTGGACAGCCCAAGACATCGTGGCGTTTTTGGCCAAGCACCCAGAGGATGTCCAGT 

CCAGTAATGGTTCTGTGTACACCTGGAGAGAAGCTTTCAACGAGACTAACCAGGCAATCC 

GGACCATATCTCGCTTCATGGAGTGTGTCAACCTGAACAAGCTAGAACCCATAGCAACAG 

AAGTCTGGCTCATCAACAAGTCCATGGAGCTGCTGGATGAGAGGAAGTTCTGGGCTGGTA 

TTGTGTTCACTGGAATTACTCCAGGCAGCATTGAGCTGCCCCATCATGTCAAGTACAAGA 

TCCGAATGGACATTa^CAATGTGGAGAGGACAAATAAAATCAAGGATGGGTACTGGGACC 

CTGGTCCTCGAGCTGACCCCTTTGAGGACATGCGGTACGTCTGGGGGGGCTTCGCCTACT 

TGCAGGATGTGGTGGAGCAGGCAATCATCAGGGTGCTGACGGGCACCGAGAAGAAAACTG

GTGTCTATATGCAACAGATGCCCTATCCCTGTTACGTTGATGACATCTTTCTGCGGGTGA

TGAGCCGGTCAATGCCCCTCTTCATGACGCTGGCCTGGATTTACTCAGTGGCTGTGATCA 

TCAAGGGCATCGTGTATGAGAAGGAGGCACGGCTGAAAGAGACCATGCGGATCATGGGCC 

TGGACAACAGCATCCTCTGGTTTAGCTGGTTCATTAGTAGCCTCATTCCTCTTCTTGTGA 

GCGCTGGCCTGCTAGTGGTCATCCTGAAGTTAGGAAACCTGCTGCCCTACAGTGATCCCA 

GCGTGGTGTTTGTCTTCCTGTCCGTGTTTGCTGTGGTGACAATCCTGCAGTGCTTCCTGA 

TTAGCACACTCTTCTCCAGAGCCAACCTGGCAGCAGCCTGTGGGGGCATCATCTACTTCA 

CGCTGTACCTGCCCTACGTCCTGTGTGTGGCATGGCAGGACTACGTGGGCTTCACACTCA 

AGATCTTCGCTAGCCTGCTGTCTCCTGTGGCTTTTGGGTTTGGCTGTGAGTACTTTGCCC 

TTTTTGAGGAGCAGGGCATTGGAGTGCAGTGGGACAACCTGTTTGAGAGTCCTGTGGAGG 

AAGATGGCTTCAATCTCACCACTTCGGTCTCCATGATGCTGTTTGACACCTTCCTCTATG 

GGGTGATGACCTGGTACATTGAGGCTGTCTTTCCAGGCCAGTACGGAATTCCCAGGCCCT 

GGTATTTTCCTTGCACCAAGTCCTACTGGTTTGGCGAGGAAAGTGATGAGAAGAGCCACC 

CTGGTTCCAACCAGAAGAGAATATCAGAAATCTGCATGGAGGAGGAACCCACCCACTTGA 

AGCTGGGCGTGTCCATTCAGAACCTGGTAAAAGTCTACCGAGATGGGATGAAGGTGGCTG 

TCGATGGCCTGGCACTGAATTTTTATGAGGGCCAGATCACCTCCTTCCTGGGCCACAATG 

GAGCGGGGAAGACGACCACCATGTCAATCCTGACCGGGTTGTTCCCCCCGACCTCGGGCA 

CCGCCTACATCCTGGGAAAAGACATTCGCTCTGAGATGAGCACCATCCGGCAGAACCTGG 

GGGTCTGTCCCCAGCATAACGTGCTGTTTGACATGCTGACTGTCGAAGAACACATCTGGT 

TCTATGCCCGCTTGAAAGGGCTCTCTGAGAAGCACGTGAAGGCGGAGATGGAGCAGATGG 

CCCTGGATGTTGGTTTGCCATCAAGCAAGCTGAAAAGCAAAACAAGCCAGCTGTCAGGTG 

GAATGCAGAGAAAGCTATCTGTGGCCTTGGCCTTTGTCGGGGGATCTAAGGTTGTCATTC 

TGGATGAACCCACAGCTGGTGTGGACCCTTACTCCCGCAGGGGAATATGGGAGCTGCTGC 

TGAAATACCGACAAGGCCGCACCATTATTCTCTCTACACACCACATGGATGAAGCGGACG 

TCCTGGGGGACAGGATTGCCATCATCTCCCATGGGAAGCTGTGCTGTGTGGGCTCCTCCC 

TGTTTCTGAAGAACCAGCTGGGAACAGGCTACTACCTGACCTTGGTCAAGAAAGATGTGG 

AATCCTCCCTCAGTTCCTGCAGAAACAGTAGTAGCACTGTGTCATACCTGAAAAAGGAGG

ACAGTGTTTCTCAGAGCAGTTCTGATGCTGGCCTGGGCAGCGACCATGAGAGTGACACGC 

TGACCATCGATGTCTCTGCTATCTCCAACCTCATCAGGAAGCATGTGTCTGAAGCCCGGC 

TGGTGGAAGACATAGGGCATGAGCTGACCTATGTGCTGCCATATGAAGCTGCTAAGGAGG 

GAGCCTTTGTGGAACTCTTTCATGAGATTGATGACCGGCTCTCAGACCTGGGCATTTCTA 

GTTATGGCATCTCAGAGACGACCCTGGAAGAAATATTCCTCAAGGTGGCCGAAGAGAGTG 

GGGTGGATGCTGAGACCTCAGATGGTACCTTGCCAGCAAGACGAAACAGGCGGGCCTTCG 

GGGACAAGCAGAGCTGTCTTCGCCCGTTCACTGAAGATGATGCTGCTGATCCAAATGATT

CTGACATAGACCCAGAATCCAGAGAGACAGACTTGCTCAGTGGGATGGATGGCAAAGGGT

CCTACCAGGTGAAAGGCTGGAAACTTACACAGCAACAGTTTGTGGCCCTTTTGTGGAAGA 

GACTGCTAATTGCCAGACGGAGTCGGAAAGGATTTTTTGCTCAGATTGTCTTGCCAGCTG 

TGTTTGTCTGCATTGCCCTTGTGTTCAGCCTGATCGTGCCACCCTTTGGCAAGTACCCCA 

GCCTGGAACTTCAGCCCTGGATGTACAACGAACAGTACACATTTGTCAGCAATGATGCTC 

CTGAGGACACGGGAACCCTGGAACTCTTAAACGCCCTCACCAAAGACCCTGGCTTCGGGA 

CCCGCTGTATGGAAGGAAACCCAATCCCAGACACGCCCTGCCAGGCAGGGGAGGAAGAGT 

GGACCACTGCCCCAGTTCCCCAGACCATCATGGACCTCTTCCAGAATGGGAACTGGACAA 

TGCAGAACCCTTCACCTGCATGCCAGTGTAGCAGCGACAAAATCAAGAAGATGCTGCCTG 

TGTGTCCCCCAGGGGCAGGGGGGCTGCCTCCTCCACAAAGAAAACAAAACACTGCAGATA 

TCCTTCAGGACCTGACAGGAAGAAACATTTCGGATTATCTGGTGAAGACGTATGTGCAGA

TCATAGCCAAAAGCTTAAAGAACAAGATCTGGGTGAATGAGTTTAGGTATGGCGGCTTTT 

CCCTGGGTGTCAGTAATACTCAAGCACTTCCTCCGAGTCAAGAAGTTAATGATGCCATCA 

AACAAATGAAGAAACACCTAAAGCTGGCCAAGGACAGTTCTGCAGATCGATTTCTCAACA 

GCTTGGGAAGATTTATGACAGGACTGGACACCAGAAATAATGTCAAGGTGTGGTTCAATA 

ACAAGGGCTGGCATGCAATCAGCTCTTTCCTGAATGTCATCAACAATGCCATTCTCCGGG 

CCAACCTGCAAAAGGGAGAGAACCCTAGCCATTATGGAATTACTGCTTTCAATCATCCCC 

TGAATCTCACCAAGCAGCAGCTCTCAGAGGTGGCTCTGATGACCACATCAGTGGATGTCC 

TTGTGTCCATCTGTGTCATCTTTGCAATGTCCTTCGTCCCAGCCAGCTTTGTCGTATTCC 

TGATCCAGGAGCGGGTCAGCAAAGCAAAACACCTGCAGTTCATCAGTGGAGTGAAGCCTG 

TCATCTACTGGCTCTCTAATTTTGTCTGGGATATGTGCAATTACGTTGTCCCTGCCACAC 

TGGTCATTATCATCTTCATCTGCTTCCAGCAGAAGTCCTATGTGTCCTCCACCAATCTGC 

CTGTGCTAGCCCTTCTACTTTTGCTGTATGGGTGGTCAATCACACCTCTCATGTACCCAG 

CCTCCTTTGTGTTCAAGATCCCCAGCACAGCCTATGTGGTGCTCACCAGCGTGAACCTCT 

TCATTGGCATTAATGGCAGCGTGGCCACCTTTGTGCTGGAGCTGTTCACCGACAATAAGC 

TGAATAATATCAATGATATCCTGAAGTCCGTGTTCTTGATCTTCCCACATTTTTGCCTGG 

GACGAGGGCTCATCGACATGGTGAAAAACCAGGCAATGGCTGATGCCCTGGAAAGGTTTG 

GGGAGAATCGCTTTGTGTCACCATTATCTTGGGACTTGGTGGGACGAAACCTCTTCGCCA 

TGGCCGTGGAAGGGGTGGTGTTCTTCCTCATTACTGTTCTGATCCAGTACAGATTCTTCA 

TCAGGCCCAGACCTGTAAATGCAAAGCTATCTCCTCTGAATGATGAAGATGAAGATGTGA 

GGCGGGAAAGACAGAGAATTCTTGATGGTGGAGGCCAGAATGACATCTTAGAAATCAAGG 

AGTTGACGAAGATATATAGAAGGAAGCGGAAGCCTGCTGTTGACAGGATTTGCGTGGGCA 

TTCCTCCTGGTGAGTGCTTTGGGCTCCTGGGAGTTAATGGGGCTGGAAAATCATCAACTT 

TCAAGATGTTAACAGGAGATACCACTGTTACCAGAGGAGATGCTTTCCTTAACAAAAATA

GTATCTTATCAAACATCCATGAAGTACATCAGAACATGGGCTACTGCCCTCAGTTTGATG

CCATCACAGAGCTGTTGACTGGGAGAGAACACGTGGAGTTCTTTGCCCTTTTGAGAGGAG

TCCCAGAGAAAGAAGTTGGCAAGGTTGGTGAGTGGGCGATTCGGAAACTGGGGCTCGTGA 

AGTATGGAGAAAAATATGCTGGTAACTATAGTGGAGGCAACAAACGCAAGCTCTCTACAG

CCATGGCTTTGATCGGCGGGCCTCCTGTGGTGTTTCTGGATGAACCCACCACAGGCATGG 

ATCCCAAAGCCCGGCGGTTCTTGTGGAATTGTGCCCTAAGTGTTGTCAAGGAGGGGAGAT 

CAGTAGTGCTTACATCTCATAGTATGGAAGAATGTGAAGCTCTTTGCACTAGGATGGCAA 

TCATGGTCAATGGAAGGTTCAGGTGCCTTGGCAGTGTCCAGCATCTAAAAAATAGGTTTG 

GAGATGGTTATACAATAGTTGTACGAATAGCAGGGTCCAACCCGGACCTGAAGCCTGTCC 

AGGATTTCTTTGGACTTGCATTTCCTGGAAGTGTTCTAAAAGAGAAACACCGGAACATGC 

TACAATACCAGCTTCCATCTTCATTATCTTCTCTGGCCAGGATATTCAGCATCCTCTCCC 

AGAGCAAAAAGCGACTCCACATAGAAGACTACTCTGTTTCTCAGACAACACTTGACCAAG 

TATTTGTGAACTTTGCCAAGGACCAAAGTGATGATGACCACTTAAAAGACCTCTCATTAC 

ACAAAAACCAGACAGTAGTGGACGTTGCAGTTCTCACATCTTTTCTACAGGATGAGAAAG 

TGAAAGAAAGCTATGTATGAAGAATCCTGTTCATACGGGGTGGCTGAAAGTAAAGAGGAA 

CTAGACTTTCCTTTGCACCATGTGAAGTGTTGTGGAGAAAAGAGCCAGAAGTTGATGTGG 

GAAGAAGTAAACTGGATACTGTACTGATACTATTCAATGCAATGCAATTCAATGCAATGA 

AAACAAAATTCCATTACAGGGGCAGTGCCTTTGTAGCCTATGTCTTGTATGGCTCTCAAG 

TGAAAGACTTGAATTTAGTTTTTTACCTATACCTATGTGAAACTCTATTATGGAACCCAA 

TGGACATATGGGTTTGAACTCACACTTTTTTTTTTTTTTTTGTTCCTGTGTATTCTCATT 

GGGGTTGCAACAATAATTCATCAAGTAATCATGGCCAGCGATTATTGATCAAAATCAAAA 

GGTAATGCACATCCTCATTCACTAAGCCATGCCATGCCCAGGAGACTGGTTTCCCGGTGA 

CACATCCATTGCTGGCAATGAGTGTGCCAGAGTTATTAGTGCCAAGTTTTTCAGAAAGTT 

TGAAGCACCATGGTGTGTCATGCTCACTTTTGTGAAAGCTGCTCTGCTCAGAGTCTATCA 

ACATTGAATATCAGTTGACAGAATGGTGCCATGCGTGGCTAACATCCTGCTTTGATTCCC 

TCTGATAAGCTGTTCTGGTGGCAGTAACATGCAACAAAAATGTGGGTGTCTCCAGGCACG

GGAAACTTGGTTCCATTGTTATATTGTCCTATGCTTCGAGCCATGGGTCTACAGGGTCAT 

CCTTATGAGACTCTTAAATATACTTAGATCCTGGTAAGAGGCAAAGAATCAACAGCCAAA 

CTGCTGGGGCTGCAACTGCTGAAGCCAGGGCATGGGATTAAAGAGATTGTGCGTTCAAAC 

CTAGGGAAGCCTGTGCCCATTTGTCCTGACTGTCTGCTAACATGGTACACTGCATCTCAA 

GATGTTTATCTGACACAAGTGTATTATTTCTGGCTTTTTGAATTAATCTAGAAAATGAAA

Errors in public sequence (differences between all samples and GenBank AJ012376):

Columns: Exon/Intron; Nucleotide; Amino acid change; Sequence difference/context

T150C, no change
  Public sequence:  TGTCAGCTGTTACTGGAAGT
  Correct sequence: TGTCAGCTGCTGCTGGAAGT

C839T, no change
  Public sequence:  AGGAGCTGGCCGAAGCCACAA
  Correct sequence: AGGAGCTGGCTGAAGCCACAA

  Public sequence:  ATGATGCCACCAAACAAATG
  Correct sequence: ATGATGCCATCAAACAAATG

  Public sequence:  GTGGCTCCGATGACCACA
  Correct sequence: GTGGCTCTGATGACCACA

  Public sequence:  CCTTAACACAAATAGTATC
  Correct sequence: CCTTAACAAAAATAGTATC

  Public sequence:  AGTGTTCCAAAAGAGAAA
  Correct sequence: AGTGTTCTAAAAGAGAAA

not applicable
  Public sequence:  AGTAAAGAGGGACTAGACTTT
  Correct sequence: AGTAAAGAGGAACTAGACTTT

  More common: GCCTACTTGCAGGATGTGGT
  Less common: GCCTACTTGCGGGATGTGGT

delta CTT 2151-3
  More common: CCTCATTCCTCTTCTTGTGAGC
  Less common: CCTCATTCCTCTTGTGAGC

  More common: AGGACTACGTGGGCTTCAC
  Less common: AGGACTACATGGGCTTCAC

  More common: AGTCTACCGAGATGGGAT
  Less common: AGTCTACTGAGATGGGAT

  More common: GGCCAGATCACCTCCTTCCTG
  Less common: GGCCAGATCATCTCCTTCCTG

  More common: TACACACCACATGGATGAAGCG
  Less common: TACACACCACACGGATGAAGCG

Intron 24 (+1) G to C, altered transcript length, splice donor site
  More common: CCTGGAAGAAGTAAGTTAAGT
  Less common: CCTGGAAGAACTAAGTTAAGT

  More common: GCCTGTGCGTCCCCCAGG

GG 4956-57 to C
  More common: TAGCCATTATGGAATTACTGCT
  Less common: TAGCCATTATCAATTACTGCT

Exon 41, delta AAGATG 5752-5757, (E,D)1893-1894
  More common: GATGAAGATGAAGATGTGAGGCGGGA
  Less common: GATGAAGATGTGAGGCGGGA

  More common: ATAGTTGTACGAATAGCAGG
  Less common: ATAGTTGTATGAATAGCAGG

Promoter Variants:

Location | Position relative to Xenon cDNA | Position relative to SEQ ID NO: 14 containing Exon 1 | Allele | Sequence | SEQ ID NO:
   | G57C | 3216 | More common | ACACGCTGGGGGTGCTGGCTG | 204
   |  |  | Less common | ACACGCTGGGCGTGCTGGCTG | 205
   | (-) 4 ins G | 3156 | More common | ACCAGCCACGGCGTCCCTG | 206
   |  |  | Less common | ACCAGCCACGGGCGTCCCTG | 207
5' | A (-) 380 G | 2780 | More common | CATTTTCTTAGAAAAGAGAGGT | 208
   |  |  | Less common | CATTTTCTTAGAGAAGAGAGGT | 209
   | A (-) 479 C | 2681 | More common | GAAAATTAGTATGTAAGGAAG | 210
   |  |  | Less common | GAAAATTAGTCTGTAAGGAAG | 211
   | A (-) 738 G | 2422 | More common | CCTCCGCCTGCCAGGTTCAGCGATT | 212
   |  |  | Less common | CCTCCGCCTGCCGGGTTCAGCGATT | 213
5' | A (-) 1045 G | 2115 | More common | TATGTGCTGACCATGGGAGCTTGTT | 214
   |  |  | Less common | TATGTGCTGACCGTGGGAGCTTGTT | 215
5' | A (-) 1113 G | 2047 | More common | GTGACACCCAACGGAGTAGGG | 216
   |  |  | Less common | GTGACACCCAGCGGAGTAGGG | 217
5' | (-) 1181 ins CCCT | 1979 | More common | AGTATCCCTTGTTCACGAGAA | 218
   |  |  | Less common | AGTATCCCTCCCTTGTTCACGAGAA | 219

Polymorphisms:

' Exon/lntron 


Muclflotidwf 


Amino add 
changa 




S£<3 ID NO: 








5 


G548A 


no change 


More comnxsn 


CTGGGTTCCTGTATCACAAC-C 








Less common. 


PTGGGTTCCrATATCACAACr 


221 








6 


G730A 


R219K 


More common. 


|gGCCTACCAAGGGAGAAAC73 


222 






Less common 


pGCCTACCAAACXIAGAAACrS 


223 




1 1 




Intron 7 


G {+) 2383 T 


Not applicable 


iAIIete 1- 


ITTTAAAGGGGGTGATTAGCIA 


224 






kiete 2 


jnTAAAGGGGTTGATTAGGA 


225 




1 1 


intron 7 


G (+) 3035 T 


Not appicable 


[Allele 1 


(GAACyuUVTITGITn':' i'-Lv^ . 


226 








Allele 2 


bAAGAAArnTfiTi'i i : 


227 






8 


C1010T ino change 


iMore common* 


|GCGGGCaTCCCGAGGGAGG?3 


228 








iLess common 


bcGGGCATCCTGAGGGAGG?^ 


229 


! i 1 




\ 3 


G1022A ino change 


More common 


fAGGGAGGGGGGCTGAAGAr:!^ 


230 


t 




Less common 


kcGGAGGGGGACTGAAGATC?; 


231 








ir^ron 9 


H42tns G 


Not applicable 


|More common 


jAGGAGCCAAACGCrCArrGT 


232 






ILess common- 


bvGGAGCCRAAGCGCTCArrGT 


233 








tntron 13 


T (+) 24 A 


Not applicable 


[More common 


UAGCCACTGTrnTAACC^CT 


234 






jLess common. 


Uagccactgtatttaaccajt 


235 








15 


A2394C 


T774P 


jMore common. 


iOGTGGGCrrCACACrCAAG;.r 


236 






[Less common 


CGTGGGCTrCCCACTCAAG.iT 


237 








15 


G2402C 


K776N 


|More common 


hrCACACTCAAGATCTTCGCrG 


238 






ILess common- 


Itcacactcaacatcttcgctg 


239 




i i : ; 


Intron 14 


C(+) 16 T 


Not applicable 


iAIiele 1 


pCAGCCTCACCCGCTCrrC^C 


240 






Uiiele 2 


IGCAGCCTCACTCGCTCTTCrj 


241 




1 1 


17 


A2723G 


I883M 


{Allele 1 


IftGAAGAGAATATCAGAAATrr 


242 






Allele 2


bvGAAGAGAATGTCAGAAAT~ 


243 




Intron 17 


C (+) 2000 G


Not applicable


Allele 1


pCGCAGTGCCCTGTGTCCTrri 


244 






Allele 2


iGCGCAGTGCGCTGTGTCv,* .A 


245 






21 


T3233G 


no change 


More common


iGATCTAAGGrrGTCATTC?C-G 


246 






Less common


iGATCTAAGGTGGTCATTCrCO 


247 






Intron 21


G 11ST


Not applicable


Allele 1


jCTCTTCTGTTAGCACAGAACAGA 


248 






Allele 2


CTCTTCnSTTATCACAGAACAGA 


249 






Intron 21 


A (+) 563 G


Not applicable


Allele 1


CATTCTAGGGATCATAGCCAT 


250 






Allele 2


IcATTCTAGGGGTCATAGCCAT 


251 



G {*) 321 T 



GTACAGTGGGAGGAACA3; 



GTACAGTGTGAGGAACAGCG 



A (-) 624 G



Not applicable



lATTCCTAAAAAATAGAAATGCA 



iTTCCTAAAAAGTAGAAATCCA 



T (+) 30 C



Not applicable



More common



pGi 



CCCCrGCCTTATTATTACT 



Less common



GCCCCTGCCGTATTATTACT 



A (+) 732 G



Not applicable



Allele 1
Allele 2



iTGAGAGAATTACrTGAACCCGG 



y^GAGAATTGCrrGAACCCGG 



C (+) 898 T



Not applicable



Allele 1



ITTTGCTGAAACAATCACTC^C 



|TTTG( 



ICTGAAATAATCACTGAC 



C (^^) 234 T



Not applicable



Allele 1



iAA( 
kA( 



.CCTCAGTTCCCTCATCTGTG 



Allele 2 



.CCTCAGrrTCCTCATCTGTG 



More common 



(CTGGACACC&GAAATAATGTC 



yVCACCAAAAATAATGTC 



C5266G 



ITCCTATGTGTCCTCCACCAAT 



'CCTATGTGTCCTCCACCAAT 



T (+) 18 C



Not applicable



More common



J\GAAGTGGCTTGTATnTGC 



jAA( 



iGAAGTGGCCTGTATITTGC 



A (->•) 1665 G



Not applicable



lAACTGATTTGATTGGTATAGCTC 



|AACTGATTTGGTrGGT=.*r.=LGCTC 



C6521T



no change



More common



jCAGGGTCCAACCCGGACCTGA 



Less common 



iCAGGGTCCAATCCGGACCTGA 



{•*-) 14 ins T



Not applicable



More common 



KTCAGGGATGGGGAC; 



Less common 



iCGTCAGGGATTGGGGACAG 



Exon 16 


G2547A


V825I


More common 


jcxiACTrCGGTCTCCATG 


286






Less common 


CCACTTCGATCTCCATG 


287 






Polymorphism in an ABC1 BAC contig:




This polymorphism is within approximately roo <& of the ABC1 gene




SEQ ID NO:








A or G


Not applicable


Allele 1 


proXJAGGCTAAGGCAGGAGAA 


274






Allele 2 


ITTXXjGWJGCTGAGGCAGGAGAA 


275
SEQ ID NO: 14



Genomic contig containing ABC1 exon 1:
Underline = putative promoter element



acctcttatagaatgatagaattcctctggaatgattggataacttcatttcatccttgacttttaccttggaggattt 
cttaccccttttggcttctcaaatttgactattaaaatgttgcctttaaaaataggaacacagtttcaggggggagtac 
cagcccatgacccttctgcaaggccccctaactcaaggtagtttccctggaactgtggtttatcgaatgtttcaggagt 
gtgaggacgtataatttaaggctgtcctagcaaggatacccttaaggatagagggcccagtagcatctggaggccagaa 
aagttaaactgaggcagtcagattagcttcaggctcaattaagctgatgggtcagcctgggagaaattgcaggatgact 
ctcaatatcccctcccacccccacagcagccacgatctgtctgtctttaatcatgggtgcagtgaacctgttctttcca 
ggtgtcttggccttcagtaaccttgttaggcttgtccctgaacgtggctaccgatccaaagacacatgatcagagaggc 
aattagacaacagaccttttccaaagcaagcatgttctgttgggcttagaagtttcatgtcctaatattataggaccct 
gtgcatctctctggagatgaggcacatgagtcatatctgtgattcttgcttttgtgtcaacatctcatgaataggcaat 
cagagctttggcaccaatgtattttcagttcatatctgatgtagttaaatccacctcctgctttgtagtttactggcaa 
gctgtttttgatataagacatctagaacactgtaaatatataacatttttatttgtctattatacctcaattacgaaaa 
agacatctagaagcaacctcatcaagagagatactgaggccgggcatggtagctcacacttgcaatcccattactttgg 
gaggctgaggcaggtagatcacttgaggtcaagagtttgaaaccagcctggccaacatgttgaaaccctgtctctatta 
aaaatacaaaaaagttagctgggcttggtggtgggcacctgtaatcccagctactccggaggctgaggcaggagaatca 
cttgaacctgggaggcagaggttgcagtgagctgagatcacaccactgcactccaacctgggcaccagagtgagattac 
atctaaaaaataaaataaagtaataaaaaagagagatattgatagctgttgttggaaatttcaacttccatctcacttc 
tggtaactttttggaagtttgttgaacaaagtggaatacacgcacatacacacacacacatactctcttgtttgtttaa 
ggtttaatgaaatagctgtcatataatcactgtttttgaaagaggagaattagttgctatctgtacattttgggtatgt 
gaactatttggatagaactctgagaaatgcattcagaacaacaaacaaaatcataggagaaatagctaagtgggaaggg 
gcatataagagttgttgaaaaagttatttcttgagaaaccagctctaatgctaggcaagtcacttgctttgggggaggc 
ctcagcttctctgtctataagattgcagcaggggtgtagtgggaatgagtcttcaacattccaagagattttatctact 
aatacgacagtcaaatggagcatgactttgtggaagcctctcctcttccacccagaggggccaatttctctgtcccagt 
gagatgttgacacttgtatgatccctgcttggagacttccctcttctggaacctgccctggctcaggcatgagggctga 
ctgtcacccttcgataggagcccagcactaaagctcatgtgttggcagtgttcttgcgggaaggaaaaagaccagccag 
cccatttgttactgcacaagcaaacagcttctggtagctgtacagatacatgcactttctttcctcactgtgtttccat 
agacagatttagtgctgtagaagagtagagggcagtcacgggaaggagttcctgtttttcttttggctatgccaaatgg 
ggaaaaatcctcctatcttgtctttttagtgtcatcctctctccccttttcttcttctttataattctcatctctcatc 
tctcctggaaatgtgcatgtcaagttcaaaagggcacaatgttttggtgaggaagaggtgggagaacacgtgccaggtg 
ctaactagggtcatcatttcccccttcacagccagcttcctgtgaatgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgt 
gtgtgtgtgtgtgtatttcttttgccagcatcactgaatctgtctgctgtctggtattccaggttttggtttagggaaa 
agtaaaagtaattttataatcccagctgtcatttaagccacccctttgtgggtagcatatggtccactctctcagttca 
ttgtcctaaagatgcttcatcagaaaggaataacttccaccccgttactctctgtccccttactctgctttatttttct 
tcgtcaatcctaccaccaccacccactgtttgaacaacccactattatttgtctgtttcccatccctggtagaatagga 
gccccatgaatgaaggaactttgcttctgttgttcaccactgaatctctaaggtatggaacacacctggcatgtgatag 
gcactcgataaatatttgttgtggctcatgggcaccttgcagagttaaggctgcagttgtttgtggaatttataagtgg 
taatgaatatttatctactattcctcttccaaggcgatcacacaataatcaggctttacactatccagttcttaggtct 
tccaagttatgacttgtgaggtatgttaattatgataatagaaggcagtttatttggttcagatttattgatgtgtaat 
ttaccacagtaagacttccccrttacaaaagtatgatgagttttgacaaatggatacacatgtgtatctaccactgcca 
tgctccttttcagtctgtcgtcccctccacccatgaccactggtcaccactgcagtgatttctgtccccttcatttcac 
cttttccagaatgtcatataaatggaatcatgcagtatgtagttttttgtgtctggcttatttttcttagcattaggct 
tttgggattcatccaggttgtcgcatgtaacagtagcttattcctttttatggctgagtaagtgtcccagttttattta 
tatatttatttatgaggaggtctctcactctgtcacccaggctggagtgcggtagcgcgatctcagctcactgcaacct 
ccgcctcccaggttcaagcaattctcctgcctcctgagtagctgggattacaggcacccaccgccacgcccaactaatt 
tttatatttttagtagagatccggtttcaccatgttggccaggctgatctcaaactcttgacctcaggtgatccgccca 
cctctggctcccaaagtgctaggattacaggcatgagccactgtgcccagccccagttttatttattcaccagttgatg 
gtcttttcgacaactaattgtttccagtttttggctattctgtataaggcttctataaatattcacaaatacctaggat 
gggatgactgggtcatataatagtactgtataaccttagcagaaactgtcaaactattttccaaagtggctcttccatt 
ttacaattccacagtgtattgagtcccagtgtctccatacacatgctagcacttttaatatttaatttagtgggtatgt 
aatgatatctcattgtggttttaatttgcatttctctgcagctaatgatgagtgtttctgcttatttgggaaggtttta 
atttagcagtctgttgtattctgtagatattaataacttcaaaatatcagtggcatttgcagttaaaatttccttaaaa 
aattggccaaaggtttccagcagtcacttctgccatgcccaaactgtatgaaacaaggctgaggtgtggagattgtcac 
attttggcaaggagtgatccacttgggtgactgatgagacccagagagcgtacgcctcgggcttgagggtgaggacggg 
cgggaagtcgactgcatggccctgctggccttgggaggctgcccagtccttagctaaagctggcagttatgggaaacag 
acttagattctattacgtttttcaggatgtcccaggagtcacctgggaagctcagcagtcctttgtgactttcaagcat
atggtagaagctgctgaacacagagctccctctttggggataatttgcccaaatcatttaatcaggcttgagaaatgag 
ttaccacaggtccaggagtgctgccacccttgaattctgacaccctatttctcctatccgtctcttaattaattaagca 
cacatccccaactgcttacgacaagccaggacccttttgcatactaaggaaaacagggatgaaggaaacagaaatggtc 
tctgctctgactcagaaggtagaaatcctctttcccagccaagtcttcctagggagcacgtaggaagggctctgaaccc 
acgtgtcagttgcaggggaggatatcaggaaaggacattgaagaagtggagacctaagtttgagacctaggcattagcc 
aggctagcagtgcttgaaaaagtgtcttaggacaagagaactcaccagtgaagtcccagtggtaggagagcgtgcagca 
tattctgagcctgtatacacatctccagggcattgcttagcaggtggggagtggcaagagagtaggctggagtcacaga 
agggaggccaggtagaccttggtgagcactggactctatgttcaggtgctgaggacctggcaaaaggttttaagtcggg 
gagaggcatgttcagatatttggtctagctgagtaactttgggtgctctgtgacaaatggttgggagaccagtgaggtg 
gcagttgcggtcatctaggagcaggatcagagtggcctattgactgggatgactgtgaagtgggatcctttccagccag 
taactggaaatgtgtatgagggcagaagtgagtgtactgcatttgaaacattgagaaatctagtacatagtactgtctc 
ttttatatcttttttttttttttttttgattttggtttgtttgttcactaacttggaaaactgatgtggaaatgtccct 
ttggcttcagttacctgagcagaaggggccgggcattgccaaactctcctcttaggacagaattgctcccagtattgat 
cattgtgttctgagttgggggagcaaattgtgcaggaggccaggtcagtgccaaggtgggtgggaggaattggagcagg 
aagcttccctaagtgtgcccagcaaagccacggtagaactttctactgtggctctacgctacttcttagcaaccttctc 
catgtgcttcctggagagtccttggagtcagaacctttttcttgaaacccagacactttacttccaagaaaatgctgtc 
caagaaaactcatccttcccttcttctcatgaacgttgtgtagaggtgtgtcttctcttcctttgagcttttccactca 
gggtttaggggaggtgatattctatatttgggtttggctctgggtactgcaacactaggctattaagatttcatcctta 
ctgctttgcccctcctatctttccagaaacccacaatggatttgctagaaataatggaacgtcctgtttggacaggata 
taaccatttctcagctagaggatattgttggaatgaagaaagataaatggggagaagggaactcacattgctttggcac 
ttaaattaagccatgtactgtgttgggaaattatttatattatctcgttgaatccacagtagaacacagttgaacacca 
tacaaggtaagtattgtcatccttattttaccatgaggaaattgatgcttagagagcataaagccttggccaggggcac 
atagttgggaagccggggctaattcatgcctgggctctttctgatagttttccttttttaattgtcccctcctcattgt 
taccttggggatttcaagagattcatgtagcttctaaatcaacgaactgattcctggagagcagcttctgtatgagaaa 
aatctagctaattatttatttcagtgtctctggaatgcaagctctgtcctgagccacttagaaaacaatttgggatgac 
aagcatgtgtctcacaatgctgctctggttgccagtgctgtgctgccagttgtcatctttgaacaaactgatgcagtgc 
tggtttaactcttcctctttttggagtaagaaactttggaggcctgtgtccttctagaagtttgctgagcaaatggtaa 
ggaaaagaaataggtcctaaggcttgactatttcagagaatttcttgatttattggactgtcaatgaatgaattggaat 
acatagtggtaggctgtcttttcttctcagacactgcaatttcctccaatctcttgacttttctagaagttttaatcca 
agtccttgttgggtggtagataaaagggtattgttctactagagactgaccttggcatggagatctcatttggactcac 
agatttctagtctagcgcttggttttgtatccatacctcgctactgcattcttagttccttctgctccttgttcctcat 
gcccagtgtcccaccctacccttgcccctactcctctagaggccacagtgattcactgagccatttcataagcacagct 
aggagagttcatggctaccaagtgccagcagggccgaattttcacctgtgtgtcctcccttccatttttcatcttctgc 
cccctccccagctttaactttaatataactacttgggactattccagcattaaataagggtaactgctggatgggtggc 
tgggatacacagaatgtagtatcccttgttcacgagaagaccttcttgccctagcatggcaaacagtcctccaaggagg 
cacctgtgacacccaacggagtaggggggcggtgtgttcaggtgcaggtggaacaaggccagaagtgtgcatatgtgct 
gaccatgggagcttgtttgtcggtttcacagttgatgccctgagcctgccatagcagacttgtttctccatgggatgct 
gttttctttccagagacacagcgctagggttgtcctcattacctgagagccaggtgtcggtagcattttcttggtgttt 
actcacactcatctaaggcacgttgtggttttccagattaggaaactgctttattgatggtgctttttttttttttttt 
tgagacagagtctcgctctgtcgccatgctggagtgtagtggcacaatcttggctcactgcacctccgcctgccaggtt 
cagcgattctcctgcctcagcctcccaagtagctgggactacaggtgcctgccaccatgcccagctaatttttgtattt 
ttagtagagacggggtttcaccgtattggctaggatggtctcgatttcttgacctcgtgatccgcctgcctcggcctcc 
caaagtgctgggattataggcttgagccaccacgcctggccgatggtgctttttatcatttgaaggactcagt^^£.a^ 
acccactgaaaattagtatgtaaggaagttcagggaatagtataagtcactccaggcttgaggcaaaatttacaaatgc 
tgctgactttgtatgtaaggggaggcattttcttagaaaagagaggtaggtctctgggattccagtatgccatttccat 
cctcagtgtttttggccacctgagagaggtctattttcagaaatgcattcttcattcccagatgataacatctatagaa 
ctaaaatgattaggaccataacacgtagctcctagcctgctgtcggaacacctcccgagtccctctttgtgggtgaacc 
cagaggctgggagctggtgactcatgatccattgagaagcagtcatgatgcagagctgtgtgttggaggtctcagctga 
gagggctggattagcagtcctcattggtgtatggctttgcagcaataactgatggctgtttcccctcctgctttatctt 
tcagttaatgaccagccacggcGTCCCTGCTGTGAGCTCTGGCCGCTGCCTTCCAGGGCTCCCGAGCCACACGCTGGGG 
GTGCTGGCTGAGGGAACATGGCTTGTTGGCCTCAGCTGAGGTTGCTGCTGTGGAAGAACCTCACTTTCAGAAGAAGACA 
AACAgtaagcttgggtttttcagcagcggggggttctctcattttttctttgtggttttgagttggggattggaggagg 
gagggagggaaggaagctgtgttggttttcacacagggattgatggaatctggctcttatggacacagaactgtgtggt 
ccggatatggcatgtggcttatcatagagggcagatttgcagccaggtagaaatagtagctttggtttgtgctactgcc 
caggcatgagttctgatccctaggacctggctccgaatcgcccctgagcaccccactttttccttttgctgcagccctg 
ggaccacctggctctccaaaagcccctaatgggcccctgtatttctggaagctgtgggtgaagtgagttagtggcccca 
ctcttagagatcaatactgggtatcttggtgtcaatctggattctttccttcaggcctggaggaatataataactgaga 
cttgttttatttctgcacagggttctaagccattcacttcccagatgggccaataatgctttoagtaatctggagatca
tctttaatgcgcaggtgaatggaactcttccacagagggatgtgagggctgtagagcagagtgaactccctgaaactca 
cacgtcagctctttgtctctctatctctgaacacccttccttagagatcccatctctaggatgcatttctctgtagtta 
gtttctaagtctcttgttcctgttctgcctttatttttttttcctggattctaagccagtatccccacttggctgtctt 
aatgtagcttaacatgtctgtaatcaaaatgatcatctttctgagattcaaagggctataagggactttggagagaatt 
tcattcagttttcctcaaactagaataatgcttgcactgtctgtaaaagaacaaaactgtcaaagcatccttttgttca 
ctaaatttccttttttattatagtgttacttaaatattaggaagttaaaagtaggtataaacttcttataggctgttat 
tatacaactatatgacccatacatatttacaaattaagtgcagccaaaattgcaaaatcaataccattcaaattaatac 
cttaaatgtggtgaggcagctgttgttcaactgaaaccaaattataagttgcatggcagtaaatgctatcatgctgatc 
attttgagtttggccagtctatattatcatgtgctaatgattgaattctccacccatttttctacttgtatgaccttaa 
tttgatggcacctgttccatcctcatgagtttgctacaattatactggtgccaacacaatcataaacacaaatataaac 
ttgggctttgaaatcttgtgccagaacttggctttaaagtaagcatttaaaaaatccatatgtgtttattagactttgt 
ttagatgactgttgaaatgaaaacaaagtgtttaaaatcctcttagagaacttaaatataatccctcagcaatatgtat 
acagatcttcctttgagaaaaactgattgtgttcagcctctcatgttacaaatgggcaacctgaattctgaggtctcta 
gtgagagaacagggactggaatctgtggatcctatctgttttaataataattgtaaagtataatagataatattatatt 
aaaaagagagnnnnnnacacttagaatgagcttccatgtgtgaggcactaactgattaggcattattaactagatttat 
tccttttaaggccccgcgatgtactgctatttccacatgttgtagctggggaacgtcctactcagagaggttaagtaac 
ttgtctgaggtccacaccactaacaaggagcacaggtagggttcaaatccagataatctgactttggagctggcactct 
aactcaatgtgcctaatcgcttttcagtggtgtcattattttgcctattctccatctgagaatattgaagtttctgact 
ccttccttgcctttctccctgcctcccgtggttatccccaggtcttggtgttccagtcctctatgtccgtccttactct 
tattcctttgctacagtgtgatccagggctcctgcccttcttatcctggtagaggccgcccacttgctgggaaattgtc 
tccgccatgctttatccatgttgtgtgtccattagtgagtagtgggaagaatcatatcatgttggcaatgaaagggggg 
ctatggctctggggtagtctagtctgaactcttatttt 
SEQ ID NO: 15



Genomic contig containing ABC1 exon 2:

cttttttttttttttttttttttttttttttgaggtgaagtctcactctgttgcccaggctggagtgcaatggagcgatc 
ttggctcaccccaacctctgtctcctgggttcaaacagttctcctgcctcagcctcccgagtagctgggattacaggctc 
ccgccaccatgcccagctatttttttgtattttcagtagagatggggtttcacccttttgaccaggctggtcttgaactc 
ctgacctcatgatcaacccacctcagcctcccaaagtgctgggattacaggtgtgacccaccacgcccggcctcataagt 
attttctaaatttatttacagtcatgccatttaaaaggaaagttgtattcctgtctttgttaatatttataagtgatttt 
attcagctacaagcttggaatggcatataattttgtattctgcttttttcacttaatattacatggctaatgatttctgt 
ctttcataaacattattctgatgatggcatgatatattgttgagtacatgtaccataattgaatcatttccctattgcta 
tgcaattaagttgtttccaatattttgcaattataatgtttcaatgaatgaataactttatgcatatagctttttgatat 
cttaagttcagtttcctaggatgaatttccaggaatagtaattgggcaaatgggataaacatgactcttgaatacgtatt 
gttaacattgctttcccaaagggctcaactgatttatatttccgtgttcattatcttttaaaccagctcatttactcacc 
aaacatttttaaagccattatcatgtggtaggcttagtaagaagaaagtgaccctaagggagaagcttatatataaatag 
cgtccctggtgtaccaagtgctgatacagacacaaagtacctggggaaattgagatcagggagtcctggctcagctggga 
gaaaagttcattttcatagagtcatggttttgttctttggcagaaagaaaattgctttcttccccacccccacccccagc 
tttattgaggtataattgacaaataaaaattgtatatctttaagatatgcaatgtgatatatatgtatatctcaacttaa 
aaaataagctacagaataaaaaggtgtttgctattaaaaaaaaagaaaaggctgaatgtcattcccaagcttggaaattt 
gagtatgttgcctctttgggattatttacagaaatattagcaagaccagccccatctttggtcttgagtactccactgtc 
agcatgctttcttccagagagggatccatttgcctttatttttcattctgttgtgccgtctatgcaaactattcttgata 
cttttatggtaacagtgtttttttgttccatgagataaatttatacatgctcattgtggaaaatttagaaaagacaggaa 
agtattaaaaacatcmcytttttttttttttttttttttttttttttamgcagacagagtcttgctctgtcgcccaggcc 
cgagtgcagtggcgtgatctcagctcacagcaacctccgcttcccaggtttaagtgattctcctgcctcagcctcccaag 
tagctgggagtacaggcatgcaccaccacgcccggctaattttgtatttttagtagagatggggtttcaccatgttggcc 
aggctggtctcaaactcctgacctcaggtgatccgcctgccttggcctcgcaaagttctgggattataggcaggagccac 
tgcgccagccacacctacgttcttatcatcctagtacatccactgtcattatcttgctgtatttccttctgcccagtctc 
actctgatcatgcagtggcgtgatcatgcagtgatctcggctcactgcaacctaggccttctgggttcgagtgattctcc 
tgccttagcctcctgggttcaagtgattctcttgccttggcctcccaagtagctgggattacaggcatacacccccatgc 
ccatctaatttttgtatttttagtagacacagcgtttcactaaaattttgtatttttagtagagatggggtttcaccatg 
ttggccaggctggtctccaactcctgacctcaggtgatccgcctgccttggcctcacaaagtgattacaggcatgagcca 
ctgcatccatcgccaaaaagattttttaaaagagtttaatgtagaaccatatcaaaggtctttggaaataaaaaacagtt 
ttttaaaaatatcagaaataaaacaacaaataaataaataaataaaaacacccaaaacaatctgaagcacgagcacctag 
cagaaaggttcaattatgatctattcatagagtggaatatcaagtagacattacaggacatgttttaagattatatttta 
tgtcatgggaaatgctctcccagtatgatgttaaatgaaaaaacagaatacaaaagtatatatgctgcatagtctcaata 
ttgtagagaaaaaatattatttatgtatgcatgaaaaaagacaaaagatgttaacagagatccattgttacttcagttta 
ctagggattgtctctgggaggtaggattaaggtgatttatatttacctttttaaacttttctgtatttttttattttcaa 
attttccataaaaatataaggacttgaagatcaagaaaaaatttctgctttggctcagtgcagtcgtcacgcctgtaatc 
ccagcagtttgggagccctaggggagaggatcacttgaacccaagagtttgacgttccagtgagctatgatctccggatc 
gtaccgcctggacgatggagcaagaccctgtctcaaaaaaaaaaatctttgctttttrtttttgtttgtttttgagacgg 
agtctctctctgttgccccagctggagtacagtggcacaatctcagctcaccgcaacctctgcctcctgggttcaagcga 
ttctcttgcctcagcctcccaagtacctgggattccatgcacccaccactatgcccagctacttttttgtattttcagta 
gagacagggtttcaccatgttggccaggctggtctcgaattcctgacctcagctgatccaccggccttggcctcccaaag 
tgctgggattacaggcatgagccactgtgcccagcccaatcttttgctttttttaaaaaaagaagacaaaaagggatttt 
ataccagtattatcttggctgtgtgactctgaagccacagttgtaagttataattactctgaaacacaaggccctgtgac 
tcttttgggctctttggtgtttatcttgattacaacgttggaatatagaaatgaaaggaatgggagaggtgatagacttc 
aggcagtgtaactagttgtctgaacactactggctcaattatattgtgtctagtgatttccatcttgtccgtctgctaat 
ttatcGcctggtaactcactgaggcagggttttcctttggagaaacctcattgttttaaccagtgtatcatgcttgttta 
gaagttcaatgatctttttaactcatcggagaagatgatgaccagacctggacagatggggaaggactttgcactctctc 
tttacagtcctgagtgcacacaggtcaatatggaactatgtgtgaattttcattgtctttgagagccctcttctctgccc 
catagqgagcagctttgtgtgcaattagaggagcaagggttgtgtgtatttagcacagcaggttggcctggtcctctcct 
ctcaacatagtcaccacatacctggcactatgctaaggctgggaatgcagacagatgggtgcctgctttcagagtgctca 
^tQ^gctgaggaagccagcaacagaaacagatgatttcaggagctccaggaaaatgctacaggaggagtgtgcctgggtt 
actggagtagcacaggaggagggcttctagctcaggctgagattttagtaaaggaaattatgccacgatgaatcctgaag 
aatgaatagaagtgaaccagataaagcacgataggaagcatcttcccttacctaagggaagacacagaggtatatggaat 
ggtatgttaaaaggttgggactccaaacagttctgttaaagcttagagagtggtgggagagactggagaagttgattaat 
tagtaaatgaagttgtctgtcgatttcccagatcccagtggcattggatatccatattatttttaaatttacagtgttct 
atCttatttcccactcagTGTCAGCTGCTGCTGGAAGTGGCCTGGCCTCTATTTATCTTCCTGATCCTGATCTCTGTTCG 
GCTGAGCTACCCACCCTATGAACAACATGAATgtaagtaactgtggatgttgcctgagactcaccaatggcagggaaaat 
ccaggcaattaacgtggcctaaattggacttttccaaagatgctgtctttgggaaacatcacacatgctttggatcagaa
aacctaggcttctaatttgttgataaggcatgaactcaggagactgttttcagtcctagtgaatggtgataattgtaatt 
ataacagtagacaacatctcttttacacattttaaatcatgaaaatagaataaccttactgataattttagaaagtggtg 
attaaaagcacatttaagataatgccttaacacctagtcttttccatatgcatgatctcttaatcacacattgcaaatca 
tggaacacagaatttt 
SEQ ID NO: 16



Genomic contig containing ABC1 exon 3:

atcttacaatcacagtctttctcttagggctgggctcagtgggtggattgacactccagaaatggccagatctaaaggat 
caacatttacgtagctgggaaatgtagctgggacttcagtttcactgccctagtgatttttcctaccactaagcagctca 
ctccatacccctacgagacccacaagcttatgagatactgttcttccaggaaagcactggggccagggccaccttttaat 
tgtgtttcttggcctggtcccatctttctcacaatatatagcaacagttatttacttgctgaCtttctaatgcacatcac 
acatagtcatattaaacacacacacacacacacacacacacacacacccctcaagaaacattttctgagacgtgatttcc 
tgatttcatcaaaaaagaaaagagcgggccaggcacagtgggaagtcaaggtgggtggatcacttgaggtcaggagtttg 
aaaccagcctggccaacacggtggaacctcgtctctactaaaaatacaaaaattacccaggcgtggtggcgcacacctgt 
aatcccagctactggggaggctgaggcaggagaattgcttcaacctgcgaggctgacgttgcagtgagccgagattgcgc 
cattgcactccagcctgggcaacagagtgagactctgtctcaaaaaaaaaaaaaaaaaaaaaagcataaactgaaattta 
tatgcaatttatatgcctgtgagataattctgttttctcttttggaaccccaaagagatttttttgattgatgagcaaat 
acattttagattttatttaagcattatgccaagcaccactgaagtataagtttcaagggcaaactcagttttttcatcta 
ctagacgaatgattttctggaatgattacaagcaggcaagatggtgtagtggaaatagcaaatgtcttcggcatcagaca 
agttggggtttgtttgtatcctgcctctgcccttcaccgaggttgtgatcttgggcagattgttgagttttaacctagat 
tcctctgactccagatcataaattttcagaaaagttctgaaattcttgtatatactgatggtaaatgagacttttcctta 
catctatgcacttctttgtttgtttgttttgagatggtcttgctctgttgcccagactggagtgcagtagtgcaatctcc 
gctcactacaatgtctgcctcccaggttccagtgagcctcctgcctcagcctcccaaatagctgagactacaggcatgtg 
ccaccacgtccggctaatttttgtatttttagtagagacagggttttgccatgttcaccacactggtctcgaactcctgg 
cctcacgtgattcgcccgcctcagcctcccaaagtgctgggattacaggcatgagccaccatgcccggccatatccatgc 
acttcttgcaaccttaccttcttttctcatcaccctccagggacctagttggaagagcagagttaaaagttaaggtgaaa 
cttggagaggtgtcttgtccctaggaacaaaggactggtttgaaattctctgtaaatcttccccagttcaaaccagagtt 
atcaaggtcttaaaaacttccctgggtcctgagagcccattatattatttacttgtcttcctgtacacccactgcctagt 
cctgatcctacttttgtttgcaaataggatggggcacaacgtacaaggaagggcctttgccacccctgctaagggataac 
ctgaaataccttcaccatcactgccctgtgctgcttttcacctatgccagtctgtctacagtgccagtgtctcctggcat 
tgaaaggggagaatcttttggtcctttgagtatttggttgggttacataaatctccctgaatgaagagcagctgacttag 
gcaaggggccttgtttggttttccttgaactattaacaggaagatagggagattaactgtgtaaatgttcaataggccag 
agtccctgcagagggtggccacagtgatcagatcttatcacatccttgctttgggtgttgcctctctggttggagtatgg 
atagaaaagaaagaaagaccctatattgaaatgcaaagtgcagcaagtcctgactttggattaacttctcagcccatttg 
catgaaaataaaaagatgaataaaacaaggttcccactttggagggaggtggtagctgtgagatggaaggagtgttcctg 
ctgggcaacagcagagtaagtgctggggtagattcactcccacagtgcctggaaaatcctcataggctcatttgttgagt 
ctttgtcctacaccaggcactctgcaaaaacgctttgcctgcaaggtctcatgcgatgctcaccacagctctgtgaagtt 
aattgtacttttatcaccattttacagatgagaaaactgagggtatggggtcaatgacttggctaaagtcactgcttagc 
aagctgcagggactggatgtgaattccaattggtttgactccaaagcctgtgaagctacttgttcttcaccacctagagc 
tgtggttcttgataactgtgaactcttttggggtcacaaatagccctgagaatatgatagaagcaggagctctggccttt 
ctgtccatacctgaacaggtccttgggttaagagcccctcgtccagggcctattaatcttgatcctcataagcagcatcc 
atgtattacggccgcaaaccaaactgtgccagaccgaatcctaggaccaagcccaaatatgtcccatcatccttttggta 
agaagctcattgtaagaaagaaagaggagagcaagaggatgacctagtgcatggggcctcattgttttaattagtgacaa 
aacaacaataataacaacaaaacccccgaagcttcacagatgacatcagaccccaagcctgtgtgtttttcaggtgccct 
tgaggagctttgtagctggcagaggaggtgaaactgacaaatgtttggcagatgcaggagagtaccagaggggtttgaga 
tgagctaaattccaatctaaccgcagtgttgaggaagaggcttggattgggaccatggagatgggggttctactcccagt 
cacgccagctgactttgcgagtgttctttgtcagtcactttatcttattttatttatttttatttttttgaaatggagtt 
tcgctcttgtcgcccaggctggagtgaaatggcgcgatcttggctcactgcaacctccccctcctgagttcaagcgattc 
tcctgcctcagcctccagagtacctgggattacaggcgcctgccaccaagcccatcgaatttttgtatgcttagtagaga
cagggtttcgccatgttggccagggtggtcttgaactcctgacctcaggtgatccgcccaccttggcctcccaaagtgct 
gggattacaggcgcgagccactgtgcccagcccacttcatcttaccgtagttacctccttagagtatgaaaaaataggct 
tagggcatccccaagtcccctctatgtctgagagctgaggctggctgtcaaagaggaactaaggatgccagggactttct 
gcttaggacccctctcatcacttctccaacgctggtatcatgaaccccattctacagatgatgtccactagatcaagaat 
ggcatgtgaggccaagtttccacctgagagtcagttttattcagaagagacaggtctctgggatgtggggaatgggacgg 
acagacttggcatgaagcattgtataaatggagcctcaaaatcgcttcagggaattaatgtttctccctgtgtttttcta 
CtCCtcgatttcaacagGCCATTTTCCAAATAAAGCCATGCCCTCTGCAGGAACACTTCCTTGGGTTCAGGGGATTATCT 
GTAATGCCAACAACCCCTGTTTCCGTTACCCGACTCCTGGGGAGGCTCCCGGAGTTGTTGGAAACTTTAACAAATCCATg 
taagtatcagatcaggttttctttccaaacttgtcagttaatccttttccttcctttcttgtcctctggagaattttgaa 
tggctggatttaagtgaagttgtttttgtaaatgcttgtgtgatagagtctgcagaatgagggaagggagaattttggag 
aatttggggtatttggggtatccatcacctcgagtatttatcatttctgtatgttgtgaacatttcaagtcctgtctgct 
agctattttggaatatactatatgttgttaatgatatcatgcagcagacgtgcatctgaatgggctggctctaggagcta 
gagggtaggggctggcacaaagatgcatgctggaagggtccttgcccataagaagcttacagccaaggctaggggagttc 
tgtcttctctccatcagctcacctctctcacctctgtcactgccccatcagactacaatgtctgcaggtctttctcccct
aagtgtgagctccctgagcaaagcaggatgctgccccttccctttgtattccttgctccttgcttcagtgcctgtacata 
agtatgggcataataagtgtcccccaaatgagacattgaggattcttcaaatgcacaggaccgtgatgtgagttaggacg 
cagtaaggacgatgggatgtggctcatgacaatcctgaggaagctgcagctgcggcacgcagggccacactgtcatgttc 
atggaccctagactggctttgtagcctccatgggccccttccatacac 
SEQ ID NO: 17



Genomic contig containing ABC1 exon 4:

tcatgactgccattggtataaagatgaatataatccagaccagattcatgattattcatacatttttagtgtattaactt 
ttaattctgcttttaaaataaattaaaacattctaatatgcccttaagagtatcccagcccaggccactgagcctactgt 
cgttcatggataagtttgcccctgggggcatgtgtgtgcatgcatgtgtgtgcacatgcatgatgagccgggccttgaag 
cgtggtaagatttgggtgtgtagaccaatggagaaaggcatttggggcagtgatgatgggtgggggagggaacatggtga 
tgaatggagctgggtgtggggagccatgggagtgggttagggccagcctgtggaggacctgggagccaggctgagttcta 
tgcacttggcagtcacttctgtaaagcagcagaggcagttggcctagctaaagcctttcgccttttcttgcaccctttac 
agTGTGGCTCGCCTGTTCTCAGATGCTCGGAGGCTTCTTTTATACAGCCAGAAAGACACCAGCATGAAGGACATGCGCAA 
AGTTCTGAGAACATTACAGCAGATCAAGAAATCCAGCTCAAgtaagtaaaaaccttctctgcatccgtttataattggaa 
attgacctgcaccagggaaagagagtagcccaggtgtctggggcttgttcccattagatcttccccaaggggtttttctc 
cttggtggctggcctgtggggcccctctccaggaggcattggtgaagaaactagggcagctggttgccacagacagtgat 
gtactaatcttctctgggaagacagaagaaaagtccccagggaagaatactacagacttggccttagggacagctagggg 
tgcagattgctgccaactgcattttttctgaagttggccatatggttgcagtgaatggatttatagacagagtatttctg 
tgcatataagagcaattacagttgtaagttgatatggataagtgaaagttaagcacttctttctaaaaagagaatgcaat 
tcattttcccctaatcatttcaattagtctgatgggcatttgaacttgttgtctttaaaaagtgaaatctttacctctga 
tctggtaagtatccaggcaatttcttgtgtgccacccaggaggtatctggggagtGcgcattttctgactgaggcattgg 
ctgccatagcatcagagcagccttccaggcagtggcctggcaaggggacagaggctggtgggagcagctggctgagtgca 
gccagtaatggcatgt 
SEQ ID NO: 18



Genomic contig containing ABC1 exon 5:

agctctccagctgattctgatgcatacttaagtttgagaaccattgcttgttttgcattaaacaggagattagtctctgc 
agcttgtgggaataaagctttaaatctctccaattttagctctgtgaaaaggcagtggggagacaggaatgaacggacta 
ctgccacaaacctcaggtggggtgggtgagatcatttagaagagaaagaccgggcatggtggctcacgcctgtactgtca 
ccactttgggaggccaaggcaggttggatcacaaggtcaggagtttgagaccagcctgcctatcatggtgaaaccctgtc 
tgtactaaagataaaaaaaaaaaaatttgccagucatggtgatgcatacctgcaatcccagctactcgggaggctgaggc 
aggagaatctcttgaacccgggaggcgggggttgcagtgagctgagattccaccattgcactccaacctaggtgacaggg 
tgagactccgtctcaaaataaaaaaaaaaaaagaaaaggaaaggctgtgtgtgtgtgtatgtgtgtgtgtgtgtgtgtgt 
gtgtgtgtaacagcaccatcacactgtttgagttgaggagcacatgctgagtgtgcctcaacatgttaccagaaagcaat 
attttcatgcctctcctgatatggcgatgctcccctatctcattcctgtgtgtgtttagccaggcaactgttgatcatca 
atattatgataacgtttctccactgtcccattgtgcccactttttttttttttttgagttacttactaaataaaaataaa 
acactatttctcaatagACTTGAAGCTTCAAGATTTCCTGGTGGACAATGAAACCTTCTCTGGGTTCCTGTATCACAACC
TCTCTCTCCCAAAGTCTACTGTGGACAAGATGCTGAGGGCTGATGTCATTCTCCACAAGgtaagctgatgcctCcagctt 
cctcagtagggctgatggcaattacgttgtgcagctactggaaagaaatgaataaacccttgtccttgtaatggtggtga 
sggggagggaggtagtttgaatacaacttcacttaattttacttccctattcaggcaggaattgccaaaccatccaggag 
tggaatatgcaacctggcgtcatgggccagctggttaaaataaaattgatttctgccttatcacttggcatttgtgatga 
tttcctcctacaagggatacattttaagttgagttaaacttaaaaaatattcacagttctgaggcaataaccgtggttaa 
gggttattgatctggaggagctctgtctaaaaaattgaggacaggagactttagacaagggtgtatttggagacttttaa 
gaattttataaaataagggctggacgcagtggcactgagttgagaactgttgcttgctttgcattaaataggagatcagt 
ccctgcagcttgtgggaataaggctttaaatctctccaattttagctctgtgagatggcactggggaaacagaaatgaac 
ggactagtgtcacaaagctcaggtgggatggacgagatcacttcaaaggtctgtaatcccacgtctataatcccagcact 
ttgggaggccaaggcgggaaaatcacttgaggtcaggagttcgagaccatcctggccaacaatgcaaagcctgtctctac 
taaaaatatgaaaattagctcagcgtggtggcatgctcctgtagtcccagctactcgtgaggctgagacaggagaatcgt 
ttgaacctgggaggcggaggttgcagtgagccaatatcacgccattgcactccagcctggctgacagagtgagactccat 
ctcaaaaaaaaaaaaaaaaaaagaattttataaaatcaggaaataatattagtgtttatgttgaattttaactttagaat 
catagaaaacttcctctggcatcattattagacagctcttgtgcagtgggtagcaccagacccagcttgcatggttattg 
atttttcagagacactttttgagcttattctctggcagaaaggggaactgcttcctcccctatctcgtgtctgcatacta 
gcttgtctttacaagaagcagaagtagtggaaatgtttattcttgaaaataagctttttgcttcacatgatctagaattt 
ttaaaattagaaaaatgtgcttactgcg 
SEQ ID NO: 19



Genomic contig containing ABC1 exon 6:

agtaaaatggagaattccaaattctgaaattgttagaacatagttctgtgtcttacttaaatatcgacacttacagataa 
atagcataaatgctttctccccatatttcagcccagtcctacttaaagacaacataaattgcaaaatagtgaggatgttg 
ttcatctaataaaagtggttccaggaattcagactctggattcctgtttgccaaatcatgtgtcccactcttaagaaaac 
gagttggactntggatttttctttgcaagagggacaagagtgtgggagatactgagttaatgcaacttgcaggttttaag 
tgtcctctcattgtgccttgtgctttgatacattctgagtttcagtaaagagacctgatgcattggactgttgcaatgga 
acctgttttaagatcttcaaagctgtattgatatgaagttctccaaaagacttcaaggacccagcttccaatcttcataa 
tcctcttgtgcttgtctctctttgcatgaaatgcttccagGTATTTTTGCAAGGCTACCAGTTACATTTGACAAGTCTGT 
GCAATGGATCAAAATCAGAAGAGATGATTCAACTTGGTGACCAAGAAGTTTCTGAGCTTTGTGGCCTACCAAGGGAGAAA 
CTGGCTGCAGCAGAGCGAGTACTTCGTTCCAACATGGACATCCTGAAGCCAATCCTGgtgagtagacttgctcactggag 
aaacttcaagcactaatgctttcggaatgtgaggcttttccttggacagcatgactttgttttgtagaaaagtacggctg 
gctgggagtttgtgatataatttagttcagtggtattctaagtgttcttagtgttctttcagacttttgggccatctccc 
aaagggtgaatgggaagaataagctgggtgtggctgagtttaagccaaaagttttttgtgcttgtttcaatcagagaaga 
cctgctttttcatgtttttactattataatactaagcaagagctcatttgaaaacagagttcttcatatttaaaaaaaaa 
aagtcttgaaaccattgatgggaagatggatatctatttatgtttaaaaacccatcataaagatgacattgtgggctgtc 
acagttggaaggccctcgaattagatgagaccacactatttagcttacttagtaataacattg 
SEQ ID NO: 20



Genomic contig containing ABC1 exons 7 and 8:

ccgtttggcaaatgctcagtaaaagaaaagggttagaaggggagaaaggcattttatcccaagccttcaggaatcaggat 
gaggatgtcttcaccttgtggtggggagtaattatacaattagagacagcacattggagtgtggctgatatgctgtgtga 
tgatagctctagctctctgcctagcagaggaaggacatttcaatagaagaaaaagtttaagaccttgccgagaaacagag 
aaaggatgttrgtctttttaagaagttgaaaaccctgtttgcagacaaaagccctccagttttggcagtaaactttcatg 
caagggaagaaaaaggcaggggatgacattgttgacaattgtgaggaattaccatgtgccaggcactgtgcgaggggctt 
tgtacatatcctctagttttagtgcttataaaaactctgtgatatgtgcacagcattttaaactttgctgcatagtcgag 
aaaatggaaggatggggaatttgagtcatttgcccagggttctatagctaccccaggttcccatgactggagaattgggg 
cacagggtggcgggggagagtgagtgacaagaatcctaacaatcttatttccattgagtccttataaaagaagtggatta 
actaccacgtttttaagtttttcttaaatttaggttatgtggatctggcgtttcttgttttgtcctgggtttgttttgtt 
tttgctatgctgtcttgaacatctgtcatcttgtaggcctaacggtaaacacaaaaacactttacctcctatagctttca 
attaaqatctctcagtttgtgtttgtaatagttttccaggcaagttctccctaggttcggcttctagtgtgttaaccttt 
agttataaagtgaacccaaagagagaaagtagaaacaaaacacctcacctgtttttgctcatgaattactctctatggaa 
ggaacaatcatgaacacctctgcgtatcacagaggcctatctgagtctgacgtttaagggagaccgcgtaggtccctttg 
aggactgtgaatgtgggagtcctgggactctggtgaagaacccgttccagaagagatgaatgagctggacaagttctttc 
atagaacctttaggcaggttttcttagaaatgcacattgaggattatgcttggatattgtgatgatcagaatgatactca 
atcccttctgcatttggaattctctttgaaagaaaacatcccaggcagctatttctcagagatagtgagtcccagccact 
tctagacattttcttgtgtagtctacattataatttcacagcagtctctgatatgacaaatgtcaaaatagcccaacctt 
ctctaaacttcagagatgtctgatatgatattgaataaaacaatgctcatagaaacatcaagaaaggtggattttccctg 
gatacttttttcctgcttgacaaataacagtgaagaaactgatctcacgtctttttctctttggaagcctgaacactcag 
aacccaacttgaggctcctcagctatagcaattctgacttcacagtctgtaaattattgttcttttttttctttagctta 
tgctttctgccctaatttatcttttccctgttctaatgaattattgtcctatatctgctgtgcagttaggtgacatataa 
cagcaattaaatatatgaattggtacatataaagatttgactaaaactcgatgtaaaaataagtgttctacattcaattt 
ccagtgttagaaacagtgctgacttgaacagagtgacagaattccatctttccctatttttgacagctttaaactttata 
ttttcttcctttcttgtgagccgtcattaacttgtttctcaaagccattcccgtattacccatcttgcagacgcagacag 
atttgggaatttgcggtcagagttgtattggacacatccccccagcccacatgagatccttttaatctattgcatattaa 
ctagttttaagtacaatattcctacttcatttaaaaccattaatcaaagaatgagtttgaaaatgaacaaaatgcaaact 
tacagttagaaataattgtagtgtctttagttttggttaggagtcggtttcttgtttgttaaactcaagattgtgaacag 
ttttaattcacttgtttatttccaatagagatttcaggtttacatttgaattcagaaacaaagttttctttctcattaca 
gAGAACACTAAACTCTACATCTCCCTTCCCGAGCAAGGAGCTGGCCGAAGCCACAAAAACATTGCTGCATAGTCTTGGGA 
CTCTGGCCCAGGAGgtaagttgtgtctttccagtaccaggaagcggatcatccactgtatcagtattttcattcctgagt 
ctggcaagaggtccttttgagttgaatatcacatgggatgtaatatcaattttcaaagtataagtgatgtaaacaataat 
gttttgatttccttattttagaaatgaagaaacctaaaactcatagatgtctcagagctaattggttagtggctaacagc 
tggatatctagtttagaaccttctccattttttctttttgcccctaggtaatcatacatttgtaaagaggagaattatct 
ctgccactgcccatgcactgcttttgtctgaccagcaatttctccatattgcttcttcagtagcaaggccaatcatttta 
ccaacacacatgcttgctaactaacaggaataacgtggtacccctaattcagccctttcccttgaaagcatctggcttct 
gaggttcaactatgggaatatggtctcttaatgaacattaagttgagtttgccttttaggtccacatgttgacaaatgta 
tcagagtaatctctgtcctaggatcagagggcctgtaggcacttgcaaaagcagttagctctgactcccagccagtgcac 
actccacctttctgactcccagccttgtctcaaattaggcttggaagcgaggaactgtctggtgtcccccagcataggaa 
gctgagccagggggcagtgctcacaaacaatacagactttaacgtgtaggatattggaaaataataatttgtggggaaat 
tgtctcagacttggtccacccttatttttagctgcttctctaatccgtttttctttttttggtgcttgtatctaacctac 
ccattttttggtgcttgcatcattttttcaaatatcaaaaacgaactttatgttttctaacaatgaaagtattgcatgtt 
cattgtggaaaatgctgaagacttggaaaatacaaaaatgctgagatcaaacactattgatacgttagtgtatttcttcc 
tgtcctgttctactttcttrctttgaattctgctcacgtgtttctgactgatgaggtctgacttttgggttccttttcca 
gaggagaagccttctttcagcttgccatttgttaccctggttatgaaggctggtaaccttttttactaggtagagaagct 
ggaccaactggggttcttccagggggagaatgagaaagagaaactgttttgcaagtccgtagctatttctctagggccct 
gttagctgacattgacatgccttgcattgctctgcagatcccctcgcagccctctgtcccttgttcatttctggccttag 
agaaagcaaagcagggtctctaacaggggaggctgcctctaaactcagggtttggttacagctgttttcacttacatcac 
tggccctggttttttttttttttctggcattaaaaaaaaaaattggaagcaggtgatgttcccattgctgatgtggtgga 
aactctccaagtgaacaatatacgtttttcttggcagctgtttcttgtgccctgcttgctcctggtccaggacaagcaag 
gaccatctgcctctttcaatagaacacctccagatccctttgatcaaaagttactcattgtctgacttgctatttctgtg 
agataaatgggagaagatcaataaatgcacttgtttgtccagtcagcgtgtggaaagttgataattttgaccaaagcaca 
accctgaaaggaaaagaaaaagggagtgaatgtcttctgagaagctgcctaggttcagacagtgtcacccatttccctgt 
atgctccacatgacaaacctgagtgggtctcatcatgtccattttgcagatggcaccaaggctcagaaaggttaggcaac 
ttttccagtcacccaatgagttaattgacaaaactgggattcaaacccagaactgttggattccaaagcctgtgttgttg 
cctgcttcgtgaaaaactccagtagcgactggaatagaaaggagaaccttccaagaaagaaaatacgcactagcagaacc 
tcgaaattggcaggaaatigaggacttgaggaataagatgaatgaaagctgacctgagtttcacatctgggtgatgggaag
ccaggacagggaggcagcatctcagatgtccacccagcaccgaccagctgcctggcattgctaggtgttgaggactcagc 
agtgaacacgctaacttctctgctttcttggggcacgtatagggtgagagacagaaacaaacaggtcagtgtacaatgcc 
acaggagggatatatgcagtgaagaaaaagcagggtaaggggcatagagcatgagaaggtgctttttttaaaggggktga 
ttaggaaagctctctctaaggtgacagttggacctgaaggagatgatagcatgtctgtggtgagggaaggaaactccgaa 
caggaagaatggcagatacaaagacattgatgctagagcatgcctaaggaatgtgtttaaggaccagggaaagtgagcaa 
gtggtcgggccaggagaggagctcagagcaggaggaggtgagtgccatacaggcctggcaagactttggattcctgctgg 
ctgagatgagaatccagcggagggcttgagggaggggacatgatgtgatctagagtttagactgtttacactctggttgt 
cgggttcagaagagactgggatgggggaaagggaggacaaaggacattgtgctggattgagaaagcagtaagtcagtttc 
attcattcactcaaccgatgatgttcaaataccaccatcatccgtgggctaaaggatgaagagccatccctccctgagag 
tcaggaagcacttcccagataaagtttggagtgtgagctgaggtgtaggagaaagagtaagagtttacccctgaaacggg 
tgctgggaagagtcaatagtttggaataactcaataatttatggtgcttctttagaaagatttgctggctttatgtggga 
agaaatttktttttttgattggggagtggtgggttggtggtgaggctgcctgtggaaagagaagtgagtgttttgactca 
ctgttatttaaaaatctctagggctgttccaataagcaacaaaaggcaaaatggcctggttctctgtcccctttctgtct 
gtatgcctcgtacaggttatgaaaagaaaaagttgggaaaagctgtccacctcacctaattgtgttcttgtggagtgtgc 
tagatgccccctctctggagaaaaaaaatccttgtggcctctgacccacctctggagagcctagttcccttctggaggca 
gaaggcaaagcttaggacctagagagtgctggaccacgccactcacaggaaccagcaggctgtgaggttgaaagctaggc 
atatggagctttccaggctgggtgcagggcctcgtggcccttcccctcccctctgtgctctatagctcagtcttcccagg 
cggtgtgaacacgcagtgacatttccaggaatacagggatttattaatgatttcttgtgaaatgtttggaaatacaaagt 
actctataaatatttcataatagcattggggctgagaactccacaaagtgccggaatacatttgcatgtaagacagaacg 
ctgcctgggtcattgatgcctgttgagtggcagtcacagacactgcctagggtttctgactcacgctgttgggactgttc 
tatgcagggcaccctcttgtgtggcataggatttgtgcctcaccacacactgttgtagctttgctgtcttgatgatgagt 
agagggcagtgtccaggccatggtataagcatctactgccccccagggttaccaaaaccaagccaagttgtgtctcagcg 
agctccgtgaagcatggagaagttgagtactcagagacatgacgtgacttttcaaaggctgtaagctgacgagggacata 
gctagggttcagacttgagtttttctttttctttttctttttcttttttttttaagactgagtcttgcttttgtcgccca 
ggctggattgcagtggtgcttggctcactgcaacctctgcctcccgggttcaagcaattctcctgcctcagcctccccag
tagctgggattacaggcacctgccaccatgcctggccaacatttttgtatttttttagtagagatggggtttcaccatgt
tggccaggctggtcttgaactcctgacctcaggtgatccacccgcctcgacctcccaaagtactgggattacaggtgtga 
gccactgcacccggcccagactcgagtttttcatcttaatgctttttcattgcctgacactttactgagaccaagatagg 
gaacttcacatacagtaccttttctcccaaggcggaagagggctgttcaatttctacactagagttcggggagttttaga 
aatgagtcagttatcgaggatgagagcagttcctgataggctcaaccacaatgagatgtagctgttcagagaaagcattc 
ttttatctataaactggaagataatcccggtgaaacgaagcccagccccaggggcttcactaactccaggctgtgcttct 
caaactttagtgagcataggaatcacctgggcatcttgtgaagctgtagatttgaattctgcaggtcggcagaggggtct 
cagaatccgcatttccaacaatgtctccagtaatgctgatgctgctcgtccctggaccacagattgggtagccaggttct 
ggcaagctcatcccaaggctttgagatgacatcagacaaaatatgttctgggacatggcttttgagaggtcaagaaaata 
agatgtttctttctcttctcatccccaacccttgcactgcccttttctcccttcccctaccctcctttctgtccccatcc
ctgacgccagCTGTTCAGCATGAGAAGCTGGAGTGACATGCGACAGGAGGTGATGTTTCTGACCAATGTGAACAGCTCCA
GCTCCTCCACCCAAATCTACCAGGCTGTGTCTCGTATTGTCTGCGGGCATCCCGAGGGAGGGGGGCTGAAGATCAAGTCT 
CTCAACTGGTATGAGGACAACAACTACAAAGCCCTCTTTGGAGGCAATGGCACTGAGGAAGATGCTGAAACCTTCTATGA 
CAACTCTACAAgtgagtgtccatgcagaccccagccctgtccccaaccccatccctcccttagttctggccttggcctgt 
gtcatctcctccctctgtagcagcgttagatgtctacatgcccatttgcccaccagactgagctcttcctagaggagaga 
ggcttctcttgaatagctacctgtccccagttctctgaatgcagcctggcacatctcaggtgcacagtagtgtttatcaa 
tggaatgaatgattgacagccaaccttctggttttctgggggatgtggaagggtggcttccagggtgatcaagaatgaga 
taatggcagaaggacaaatcctgcaagatctcacttatatatggaatatatgtaaggtagaaagtgtcagtttcacatga 
tgaataagttcctgggatcttgatgtacatcgtgatgactatagttagtaacactgtatagtatacttgaaatttgctaa 
gagagtagatccgaagtgttcacactacacaaaaaaggcaactatgaggtgatggatttattaacagcttgattgtggtg 
atccttttacaaagtatacatatattaaaacatcacattgtataccttaaatatatacaatttttatttgtcagttgtaa 
ctcaaaaaagctagaaaagcatttttaaaaaggatgatgtactggtcttaatattaccattgagataagctttataataa 
cataaaaagaaataacagtaatgataatagcaacaacaacaacaacaaagaactaacatttaagtagaatttcttgtgca 
ctgtgcattctgtttaagttatctcattttaccctcatgataacctgcagggaagattctttaaccccacatttcatagg 
ctcagagaggttaagtgccttggttagagccacatcagagttaatccacaagagccaggattcaagcccaaatctgcctg 
gatctgtgctctctaagataactgttagtggtggcgtgtgtgttctcacactcagacatttgatctgccctttgtttccc 
attcttagctgcaaggcagtgttaaagaaccctgtgtctccatatccactccccacacttaagcacttttgtgggcccgt 
gtgccgtatgcctcgtggcagcagggatccaatgtcacagttttaggcagtggcatccttttccttgaaaacttgatgca 
ggggaacctttctccatttccaaccacaggtgtgtctttcagacactgagtgaggcaggttttgtactttattgtaacac 
aagaaccttttcttctctggagtaaagcactccagacattcgcaagttgctttacaagccttaaaaggatggtattgtag 
gcaactttaattaaatcccatctcctcctctcccccagcttgcaagttgacccaaggaagccttcatttccatgacagac
ctaattgtgagggcatcctca 
SEQ ID NO: 21



Genomic contig containing ABC1 exons 9 through 22:

actgtgttagcaaggatggtctcgatctcctgacctcgtgatccgcctgtatcggcctcccaaagtgctgggattacagg 

cgtgaaccactgcgccctgttgagaatttttttttttttttttgggagaaagagtttcgctcttgttgcccgggctagag 

tgcagtgacacaatctcggctcactgcaacctctgcctcctgggttcaagcaattctcctgcctcagcctcatgcgtcac 

cacgcccagctaattttgtatttttagtagagacagggtttctccatgttggtcaggctggtctcgaactcccaacctca 

59t:ggttcgcccgccttggcctcccaaagtgctgggattgcaggcatgagccactgcgcccagccccaaattttggtttt 

tgcttgaaaactgaggtctgaattcagccttctggttgcccctcaagagtcagtttaaatgttggtcatgttagttgtca 

ctgaaaacaatggtgaggctggcatgagagtgtgaatctggatgggagggcttgtgcttcatgaaaacatttttccagat 

cagctcagtcgtgagttatccgtcattgacgttataataagctctgattatttatcaagcatcattctttatagatatct 

cagtttaatctgagataatcttctccacatctctccacatagatgttatgaattttacttttacagaggagccaactgag 

gctcagataagttacttattatatgactagtagtggtagagctggggtttcaactaagaactctctggctccaaagccct 

tgtaagtttctatcagtatatgaccatgcatatgagcatttgtctctcctcttcttcatagCTCCTTACTGCAATGATTT 

GATGAAGAATTTGGAGTCTAGTCCTCTTTCCCGCATTATCTGGAAAGCTCTGAAGCCGCTGCTCGTTGGGAAGATCCTGT 

ATACACCTGACACTCCAGCCACAAGGCAGGTCATGGCTGAGgtaagctgcccccagcccaagactccctccccagaatct 

ccccagaactgggggcaaaaaactcaaggtagcttcagaggtgtgcgctaagtatactcacggctcttctggaattccca 

gagtgaaaacctcaagtctgatgcagaccagagctgggccagctccccagtcgtgggtatagaatcatagttacaagcag 

gcatttcttggggatggggaggactggcacagggctgctgtgatggggtatcttttcagggaggagccaaacgctcattg 

tctgtgcttctcctcctttttctgcggtccctggctccccacctgactccagGTGAACAAGACCTTCCAGGAACTGGCTG 

TGTTCCATGATCTGGAAGGCATGTGGGAGGAACTCAGCCCCAAGATCTGGACCTTCATGGAGAACAGCCAAGAAATGGAC 

CTTGTCCGGgtgagtgtccctcccattattaccatgtgcctgcttgatactggagaggtgagtttctggtcactttccca 

ggtgtgagtgaggtgagaattctttcagtttatctagctgggggaatgtagtgagcatagctaaagtcacagggcaccac 

ctctccagaagtacaggccatggtgcagagataacgctgtgcatatcagcatccatgccactcacggtcaaatagcagtt 

ttctgcaaaacttagtgagggctggtgtttggaagtggagttgagtaattgcagtaccctattttcctttttgctgcagc 

ctctcagccagccacagcatctccctgtgtcttggtaggttttggaaagaagtgtgggagcaaaagcatgatgttacatg 

tagactggcctgagatactcattctcagggcactgtgtgaatgatgagctgctgttactgtgtggaggggaaatgcactt 

agtgcttcagagccacttgaaagggataagtgctctagagacaattgggttcaaatgtggagcaggctgagcaagaacag 

aatgtctcctttgcctgagcctgagtgctgttaatcacatcttcctgccttgggctgagttagagaatcattagactatt 

tcctgtttccatggtgagggaggcctcttccttttgtctctgctccccttaagaagcaggtgaggattttgccaggtttc 

ttgttttgaaccttattgactttaagggcggctgggttttagagactgtacctacctagggggaacacttccgaagttta 

ggactattccctgatccgctgggaggcaggttactgaggaagtccctttaaaaacaaaggagtttatactgagaaaagca 

taaacagtgatttgtatggattcacactgactaatatagctcatgccattaaagtggggtctcttctctaaaggagggtt 

atatgatctagccccgtagacctaagtgtggtttcagacctgttcttcctggtcctctccttggaatccatatttctact 

agttggactttttctgtttgtctggctctcagaggattataggaggccctgcgaagtgactcagtgaattttgatttgtg 

ggcaagtagatggttccctagtctgaaattgactttgccttaggtgcttcaattcttcataagctcccagttcttaaagg

acaagatccttgtaaacatggcaatggcattcattaggaatctagctgggaaaatccagtgtgtatgcttggaaatgagg 

gatctggggctggagagaaaggcatgggcatgccttggagggacttgtgtgtcaagctgaggacctttactttaagctct

^999gaccaggcaaggggagatgtagatacgttactctgatggggtggatgaattgaagaaggatgaggcaagaatgaag 

gcagagaccagggaggaggctctccaagtggccaaggcataaagcaagaaatgaggcctggtgactgcttagtggcagag 

cagtgaaagagagggaggcatcaaagtgagtctcgatttctagctgggtgggtggtagcgatgtccagtaggccagtggc 

tactgaggtctgcagtggaggagggtggttgggctggagacagatgatgagggagtcatcagcctgtgggtggaagaaaa 

gggaacctcttccaactgttttctttgcttcttccctctctttctctttttttttttttttggacagagtcttgctctgt 

cacccaggctgaaatgcagtggcatgatcttggctcaccacagcctccgcctcctgggttcaagcaattctcctgtctca 

gcctccagagtagctgggattacaggcacatatcactgtgcccggctaatttttgtattttcagtggagatgggatttca 

ccatgttggtcgggctggaatgaactcctgacctcaagtgatccacctgcctcagcctcccaaagtgttgggattacagg 

catgagccaccgcgcccggcctttcttccctctcttaaagagtgtttatttaattccacaaacatgagcttgtcaccccc 

tgtagcctggcatctcctacacgaggtgatggctgagccttctgcttctgctggggtagctctgatctttctgctttctc 

tggcactgtctacccatgttccctcaccccacaggtcccaggccacctctctcgggcaagtcttggaaccctctgacact 

gatttgctctcttttctgagctgcttttagccacccatcctcgggacctgttttctctctgcctccacccctgcgggcag

tcttaggtctcctgcccctcacgagcaccccagagaggccacgtgctcagtgatctcagtgggcgcatctttctagtctt 

gctattctttttggccatgtcgttcagaaaccatactgggcagggccgacttcaccctaaaggctgcgtctcttcactct 

gcttttgtttgttccaaataaagtggcttcagaattgctaaccctagcctctgtgaacttgtgaggtacaattttgtgtc 

tgttatgttaacaaaaatacatacataccttcctggtgatggtataaattgctattctctattggaaagcaatttggaat 

gaaaatttaaagaaccattt^aaaatatgctatcctgcgtacctccattccacccacccccagggatgtagcctactgaa 

ataattttaaagaagtcaccatatgagagaaaatgttattgctatattgttattgtgagaaattggaaatagactaaatg 

ttcagcactataggaataattaatgaaattacatatactctatacaatcattatgctgccattgaaataataaatacaaa 

ggcgcaaggggggaaaagcttataatgttagtgaaactaagactgatttttttataaagcagcagttttcagacccttgg 
aoactccaattcggtagaaccagagcttcatcttctctgtcgaagctgtgacaggacttgcaaatgcctctcctttttgc
tgagtttgcagctgctgtttttccggcagcacatctgtgcaggcctctgcctcggcccctctcgatctgctgattgagca 
gcggattgatctgtccttctctttcgtgttgacccatgtgaggaaccaactggcaacggaacaagaaatggaaataggcc 
tcctttgcatcatgacctgtacatcctgcaattggaaaagattgtactttagttggtttaaccagcagcattatttttct 
aaactaagcagtaagaaggaattaggttttatgtgggatcaacagactgggtctcaaaagaggaaggtgatagaacacag 
tggggagggggaggtgcactagaaacagagggcctatgctttcattctggctttgctacttaatagctgtgtgacccaat 
cttagacacttaacctctctgaacttccattttctcatgtataaaatgggaaatattaaaggatactcactgggctggtg 
gcttgtgcctgtaatcccagcacttggggaggttgaggtgggaggatcacttgagcccaggtgttcaagaccagcccagg 
caacatggcaagactctgtctctatgaaaaaattaaaaattagccaggtgtggtggtgtgcacctgtagtcttagctact 
tggtagcctgagatgggaggatcacttgggcttgggaggtcaaggctgcggtgagctgtgattccatcactgcactccag 
cccgggcggcagagcgagacactgaatccaaacgacaacaacaacaaaaggcaaaaaaataaaagtgccctctttatgga 
gttgtgtaaggtgaagcatatacactattcaacatagtaactatataaaggaagtattgttgttgttactgtagttaata 
ccattaagtgagatgtttcgtatagtggaaagcacatggactctgaattcagactggtctgactttgagtctcagctcca 
catctagtaatactatgaccaagccctggttaaaatcatgtttttttttcttcagcctcagtcttctcacatataaaata 
gggacactgtcatttacctcagttttctgtgaggataaaacaacgacagtgtatatgcaagtattttgtaaattttgtag 
tgctcctcaagatttagttggtgtttactacttgtactttctcactggaatggcagATGCTGTTGGACAGCAGGGACAAT 
GACCACTTTTGGGAACAGCAGTTGGATGGCTTAGATTGGACAGCCCAAGACATCGTGGCGTTTTTGGCCAAGCACCCAGA 
GGATGTCCAGTCCAGTAATGGTTCTGTGTACACCTGGAGAGAAGCTTTCAACGAGACTAACCAGGCAATCCGGACCATAT 
CTCGCTTCATGGAGgtgaatctgttgctgggatcatttagaaaagacttaacggcttctttctctgagacgttacaataa 
ggttcaggcaggaggcaagtttagaaataatgtatagtctcatttacaaaactatccctcaagcctaacacaggatttga 
taacaaaaggcacttaataaatgttagttgagtggttgaatgagtaaataaactctagctttagtaaattaactctagct 
tattctatataggctcaagagaatatttctacccattttcttctaggttttcctatctcagtgactaatggtagcaaagc 
attcccttaaaaaggcattatttgtgaaacttayctaaaatcgaattcgggtccaattaaatttttgaaattttatatta 
aaaattatattagtagggatgggtaagaggtgttttggtctggttggttggttagttgctatgactcagaattgctaaga 
aaacagaaaagtaagataagatcattgttttaacctcttttcctccacaaaatcaataaataacatatccctaaattact 
cttagaatttctcttaaattgcagtgaaaaaccaaaatccttcattcttggttgaaggttggaaaactacgttagagagg 
attagagagagaggatgagcaatcgtgtagtcagcccttgcctcctagtgtaggatttgtctcagccactgcttgttgtc 
ctggctgccaacgttctcatgaaggctgttcttctatcagTGTGTCAACCTGAACAAGCTAGAACCCATAGCAACAGAAG 
TCTGGCTCATCAACAAGTCCATGGAGCTGCTGGATGAGAGGAAGTTCTGGGCTGGTATTGTGTTCACTGGAATTACTCCM
GGCAGCATTGAGCTGCCCCATCATGTCA^GTACAAGATCCGAATGGACATTGACAATGTGGAGAGGACAAATAAAATCAA
GGATGGgtaagtggaatcccatcacaccagcctggtcttggggaggtccagagcacctattatattaggacaagaggtac 
tttattttaactaaaaatttggtagaaatttcaacaacaacaaaaaaactcaacttggtgtcatgattttggtgaaattg 
gtacatgacttgctggaaggtttttcataggtcataaaataacagtatcttttgatttagcatttctactcaagggaatt 
aattccaggaattttggtggcaggcacctgtaatcccagctactcgggaggctgaggcaggagaattgcttgaacccagg 
aggcagaggttgcagtgagctaagatcgcatcattgcactcccgcctgggcaataagagtgaaactccatctcaaaaaaa 
aaaaaagatacaaaaatagaaaaaggggcttggtaagggtagtagggttttgggcaattttttttttttttttttttttt 
attgtatggttctaaaggaatggttgattacctgtggtttggctttagGTACTGGGACCCTGGTCCTCGAGCTGACCCCT 
TTGAGGACATGCGGTACGTCTGGGGGGGCTTCGCCTACTTGCAGGATGTGGTGGAGCAGGCAATCATCAGGGTGCTGACG 
GGCACCGAGAAGAAAACTGGTGTCTATATGCAACAGATGCCCTATCCCTGTTACGTTGATGACATgtaagttacctgcaa 
gccactgtttttaaccagtttatactgtgccagatgggggtgtatatatgtgtgtgcatgtgcatgcatgtgtgaatgat 
ctggaaataagatgccagatgtaagttgtcaacagttgcagccacatgacagacatagatatatgtgcacacactagtaa 
acctctttccttctcatccatggttgccacttttatctttttatttttatttttttttttgagatggagtctcgctctga 
cgcccaggctggagtgcagtggctcgatctcggctcactgcaacctttgcctcccgggttcaagctattctcctgcctca 
gcctccacagtagctgggactacaggctcatgctgccacgcccggctgactttttgtattttagtagagacgaggtttca 
ccatgttacccaggctagacttcaactcctgagctcaggcaatccaccctccttggcctcccaaagtgctgggattacag 
gtgtgagccactgcacccagcccaccactttaattttttacactctacccttttggtcaaaatttgctcaatctgcaagc 
ttaaaatgtgtcatgacaaacacatgcaagcacatactcacacatagatgcagaaacagcgtctaaacttataaaagcac 
agtttatgtaaatgtgtgcacttcttctccctaggtggtaaaccacatttcaaaacaacccaaataaaactgaacaaagc 
ttcttcctcttagactttttagaaaatctttcagtgctgagtcactaagctgccaagttctcattgrgggaactatgcct 
ttggatgtaatgatttcttctaagacaatgggcggaggtgtagttattgcagacatctgaaatatgtaatgtttcttcca 
gattctggaaattctcttattctctgtggttcgtggtggtggtgggatgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgt 
gtgtgtagggatcaggatgcgggaggagctgggttctgcttgtattggttctctgttttgcattgaatagtgtgtttcct
tgtatggctatctatagcttttcaaggtcaccagaaattatcctgtttttcaccttctaaacaattagctggaatttttc 
aaaggaagacttttacaaagacccctaagctaaggtttactctagaaaggatgtcttaagacagggcacaggagttcaga 
ggcattaagagctggtgcctgttgtcatgtagtgagtatgtgcctacatggtaaagctttgacgtgaacctcaagttcag 
ggtccaaaatctgtgtgcctttttactttgcacatctgcattttctattctagcttggaatctgaaacattgacaagagc 
tgcctgaaatgtatgtctgtggtgtgattagagttacgataagcaagtcaatagtgagatgaccttggagatgttgaact 







tittgtgagacaatgagttgtttttttgttttggtttttagtactttaacataatctacctttagtttaagtatcgctcac 

agttacctacttactgaagcaagcccccaaagaaatttggtttggcaacactttgttagcctcgtttttctctctacatt 

gcattgctcgtgaagcattggatcatacgtacatttcagagtctagagggcctgtccttctgtggcccagatgtggtgct 

ccctctagcatgcaggctcagaggccttggcccatcaccctggctcacgtgtgtctttctttctccccttgtccttcctt 

ggggcctccacCTTTCTGCGGGTGATGAGCCGGTCAATGCCCCTCTTCATGACGCTGGCCTGGATTTACTCAGTGGCTGT 

GATCATCAAGGGCATCGTGTATGAGAAGGAGGCACGGCTGAAAGAGACCATGCGGATCATGGGCCTGGACAACAGCATCC 

TCTGGTTTAGCTGGTTCATTAGTAGCCTCATTCCTCTTCTTGTGAGCGCTGGCCTGCTAGTGGTCATCCTGAAGgtaagg 

cagcctcactcgctcttccctgccaggaaactccgaaatagctcaacacgggctaagggaggagaagaagaaaaaaaatc 

caagcctctcctagagaaggggtcatacctgtcatttcctgcaatttcatccatttatagttggggaaagtgaggcccag 

agaggggcactgacttgcccaaggtcaacccagccgggtagcagctaagtaggatgagagtgcagggttcatgctttcca 

gataaccacatgctcaactgtgccatgctgtctcattggtagtggttcatggcagcatctgaaagctatttattttctta 

gatatattgcgtggcgattcttcctaagtttctaagaacaataatcagaaggatatatattgttgcaggttagactgtct 

ggaagcagacgctgaaatagagtttgatgtatgggtatttatgagggctcaatacctatggaagagatatggaagatgca 

ggattgggcagagggaggagttgaactgtgatatagggccaaccccgtggggcactctagagaatatgcagcttgttgga 

gttgttcttcatcgagctgaaacatccagccctttgtgctcccccaaggcctccctcctgacaccacctacctcagccct 

ctcaatcaatcactggatgtgggctgccctgggaaggtcgtgccccagggcctacatggctctctgctgctgtgacaaac 

ccagagttgctgatgcctgaggccgtctactgacagctgggcaacaaggcttccctgaatggggactctgggcagtgcag 

ttttgtgtctgaaccatacattaatatatttatatccgaattttctttctctgcaagcatttcatataaagacacatcag 

gtaaaaataaatgtttttgaagcaaaaggagtacaaagagataagaactaactaatttaatactagttaccatctgttac 

aaatacttcctactgattgccaaggactgtttaaacacatcacatgggcttcttcttctatcctcactaacccttttaac 

agacaaggaaatgaggctcaggaaggtcaaggactttattgaggttccacagtaggatacagttcttgctaaaagcaacc 

cctccctcatgctctgttatctaactgcaaggogaaggtcagtggcagaggtagtggtcccatggttggtgcataagagc 

tgctctgagacaactgcatgctggtgggtcctgcagacatgtacccatcagccggagataggctcaaaatatccacaaga 

gtttggatgattgtgggaatgcagaatccatggtgatcaagagggaaagtcaagttgcctggccattttccttggctttt 

agacagaaaagttacgtgggatattatctcccacagctcttctgtggtgccaccagtcatagtccttatataaggagaaa 

ccagttgaaattacctattgaagaaacaaagagcaaactcgcccactgaaatgcgtagaaagccctggactctgttgtat 

tcataactctgccattatttttctgcgtagttttgggtaagtcacttatcttctttaggatggtaatgatcagttgcctc 

atcagaaagatgaacagcattacgcctctgcattgtctctaacatgagtaggaataaaccctgtcttttttctgtagatc 

atacaagtgagtgcttgggattgttgaggcagcacatttgatgtgtctcttccttcccagTTAGGAAACCTGCTGCCCTA 

CAGTGATCCCAGCGTGGTGTTTGTCTTCCTGTCCGTGTTTGCTGTGGTGACAATCCTGCAGTGCTTCCTGATTAGCACAC 

TCTTCTCCAGAGCCAACCTGGCAGCAGCCTGTGGGGGCATCATCTACTTCACGCTGTACCTGCCCTACGTCCTGTGTGTG 

GCATGGCAGGACTACGTGGGCTTCACACTCAAGATCTTCGCTgtgagtacctctggcctttcttcagtggctgtaggcat 

ttgaccttcctttggagtccctgaataaaagcagcaagttgagaacagaagatgattgtcttttccaatgggacatgaac 

cttagctctagattctaagctctttaagggtaagggcaagcattgtgttttattaaattgtttacctttagtcttctcag 

tgaatcctggttgaattgaattgaatggaatttttccgagagccagactgcatcttgaactgggctggggataaatggca 

ttgaggaatggcttcaggcaacagatgccatctctgccctttatctcccagctctgttggctatgttaagctcatgacaa 

accaaggccacaaatagaactgaaaactcttgatgtcagagatgacctctcttgtcttccttgtgtccagtatggtgttt 

tgcttgagtaatgttttctgaactaagcacaactgaggagcaggtgcctcatcccacaaattcctgacttggacacttcc 

ttcccccgtacagagcagggggatatcttggagagtgtgtgagcccctacaagtgcaagttgtcagatgtccccaggtca 

cttatcaggaaagctaagagtgactcataggatgctcctgttgcctcagtctgggcttcataggcatcagcagccccaaa 

caggcacctctgatcctgagccatccttggctgagcagggagcctcagaagactgtgggtatgcgcatgtgtgtggggga 

acaggattgctgagccttggggcatctttggaaacataaagttttaaaagttttatgcttcactgtatatgcatttctga 

aatgtttgtatataatgagtggttacaaatggaatcattttatatgttacttggtagcccaccactccctaaagggactc 

tataggtaaatactacttctgcaccttatgattgatccattttgcaaattcaaafcttctccaggtataatttacactaga 

agagatagaaaaatgagactgaccaggaaatggataggtgactttgcctgtttctcacagAGCCTGCTGTCTCCTGTGGC 

TTTTGGGTTTGGCTGTGAGTACTTTGCCCTTTTTGAGGAGCAGGGCATTGGAGTGCAGTGGGACAACCTGTTTGAGAGTC 

CTGTGGAGGAAGATGGCTTCAATCTCACCACTTCGGTCTCCATGATGCTGTTTGACACCTTCCTCTATGGGGTGATGACC 

TGGTACATTGAGGCTGTCTTTCCAGgtacactgctttgggcatctgtttggaaaatatgacttctagctgatgtcctttc 

tttgtgctagaatctctgcagtgcatgggcttccctgcgaagtggtttgggctatagatctatagtaaacagatagtcca 

aggacaggcagctgatgctgaaagtacaattgtcactacttgtacagcacttgtttcttgaaaactgtgtgccaggcagc 

atgcaaaatgttttatacacattgcttcatttaattctcacaaggctactctgaagtagttactataataaccagcaatt 

ttcaaatgagagaactgtgactcaaagacgttaagtaaccagctttggtcacacaactgttaaatgttggtacgtggagg 

tgaatccacttcggttacactgggtcaataagcccaggcgaatcctcccaatgctcacccaattctgtatttctgtgtcc 

tcagagggggtacaactaggagaggttctgtttcctgagtacaggttgttaataattaaatatactagctctaaggcctg 

cctgtgatttaattagcattcaataaaaattcatgttgaatttttctttagtacttctttcttaatataatacatcttct 

tgaccaagtccaagaggaacctgcgttggacagttttcatatgagatcaaattctgagagagcaagatttaacccttttt 

ggttcaccttctgatcctcccctaaggaggtatacatgaaatatttattactcctgcctgaacttctttcattgaatatg 
caattttgcaccatgcagattctggatttaaattctgagtcttaacttactggctgagggaccttggataggctccttat
ccctcagtttcctcatctctaaaatggggatggcacctgccccgtgggttgttggaaggacttacagaggtgcagaatct 
acgttgtacatagcaggtttcagcaaatgttagctccctctttccccacatccattcaaatctgttccttctccaaagga 
tgtgtcaaggaggaaatggacctggctgggaaaccctcagaatactgggatgatgctgagcttggctcatacctgtgctt 
tgctttcagGCCAGTACGGAATTCCCAGGCCCTGGTATTTTCCTTGCACCAAGTCCTACTGGTTTGGCGAGGAAAGTGAT
GAGAAGAGCCACCCTGGTTCCAACCAGAAGAGAATGTCAGAAAgtaagtgctgttgacctcctgctctttctttaaccta
gtgctgctgcctctgctaactgttgggggcaagcgatgtctcctgcctttctaaaagactgtgaaaccactccaggggca 
gagaaatcacatgcagtgtccctttccaaatcctcccatgccatttatgtccaatgctgttgacctattgggagttcacg 
gtctcgatccctgagggacattttctttgttgtcttggcttctagaagagtatcttttacttgccccctcccaaacacac
atttcatggtctcctaacaagctagaagaaagaggtaaagacaagcgtgattgtggaaccatagcctcgctgcctgcctg 
tgacatggtgacctgtgtatcagcctgtgtgggctgagaccaagtggctaccacagagctcagcctatgcttcataatgt 
aatcattacccagatccctaatcctctcttggctcttaactgcagacagagatgtccacagctcatcaaaggctctgctt 
ctgggttctttgtgcttagagtggcttcctaaatatttaataggtcccttttctgccagtctcttctgtgcccatcccct 
gattgcccttggtaaaagtatgatgccccttagtgtagcacgcttgcctgctgttcctaatcatcttctcctacctcctc 
tttacacctagctcctgtttcagtcacctagaaatgctcacagtcgctggaatatgtcatgttcttccacacctccatgc 
ctttgtaggtactgtttgctctcacaggagaactttctctctaacttgcctatcttctcaactcctcctttctctccaag 
atctagttccggatcccctcccctgagcatccctccttggttctcaggtagtcagtcactctctgccctgaacttccatg 
gcacgtgaaagaaaatctttttattttaaaacaattacagactcacaagaagtaatacaaattacatgagggggttccct 
taaacctttcatccagtttccccaatggtagcagcatgtgtaactgtagaatagtatcaaaaccatgaaattgacatagg 
tacaattcacaaaccttcttcagatttcactagctttatgtgcgctcatttgtgtgtgtgtgtgcgtatttagttctatg 
caattttatcatgtgtgaattcatgtaattactagctcagtcaagctgcagaaatatctcattgtcacaaagctccttca 
tgctaccccttaatggccacagccacctcccttcttcctcagttcctgacacctgtcaaccactaatgcgttcctcgttt 
ttacagttttattatttctagaatgttacataaatggaaccatacagtaggtatccttttgatactggcttttttttttt 
tttcactcagcagtattcccttagatctatccaagttgtgtgtgtcaacagttcattcctcttcactgctgagtagtgtt 
ccctgggaggggtgtatcacagttccatggcatttttagatgtattttttaaacagctttcagcatcctctattttaatt 
gttcatcaagtcctttttcccaatagactctgaatgctcctttatcatcgtattcccatcaccaacatcagtacccaaat 
aggccctaaataaacatttatagcctcctgcctgcctgagaaaccagggtggacatggagagaaggcacttctgaaagtt 
caagcgcagtgcsctgtgtccttacactccactcctcagtgctttctgtgggttcatttctgtcttctctcctgtcacag 
TCTGCATGGAGGAGGAACCCACCCACTTGAAGCTGGGCGTGTCCATTCAGAACCTGGTAAAAGTCTACCGAGATGGGATG 
AAGGTGGCTGTCGATGGCCTGGCACTGAATTTTTATGAGGGCCAGATCACCTCCTTCCTGGGCCACAATGGAGCGGGGAA 
GACGACCACCATgtaagaagagggtgtggttcccgcagaatcagccacaggagggttctgcagtagagttagaaatttat 
accttaggaaaccatgctgatccctgggccaagggaaggagcacatgaggagttgccgaatgtgaacatgttatctaatc 
atgagtgtctttccacgtgctagtttgctagatgttatrtcttcagcctaaaacaagctggggcctcagatgacctttcc 
catgtagttcacagaattctgcagtggtcttggaacctgcagccacgaaaagatagattacatatgttggagggagttgg 
taattcccaggaactctgtctctaagcagatgtgagaagcacctgtgagacgcaatcaagctgggcagctggcttgattg 
ccttccctgcgacctcaaggaccttacagtgggtagtatcaggaggggtcaggggctgtaaagcaccagcgttagcctca 
gtggcttccagcacgattcctcaaccattctaaccattccaaagggtatatctttggggggtgacattcttttcctgttt 
tctttttaatctttttttaaaacatagaattaatatattatgagcttttcagaagatttttaaaaggcagtcagaaatcc 
tactacctaacacaaaaattgtttttatctttgaataatatgttcttgtttgtccattttccatgcatgcgatgttaggc 
atacaaaatacattttttaaagaatactttcattgcaaattggaaacttcgtttaaaaaatgctcatactaaaattggca
tttctaacccataggcccacttgtagttatttaccgaagcaaaaggacagctttgctttgtgtgggtctggtagggttca 
ttagaaaggaatgggggcggtgggagggttggtgttctgttctctctgcagactgaatggagcatctagagttaagggta 
ggtcaaccctgacttctgtacttctaaatttttgtcctcagGTCAATCCTGACCGGGTTGTTCCCCCCGACCTCGGGCAC 
CGCCTACATCCTGGGAAAAGACATTCGCTCTGAGATGAGCACCATCCGGCAGAACCTGGGGGTCTGTCCCCAGCATAACG 
TGCTGTTTGACATgtgagtaccagcagcacgttaagaataggccttttctggatgtgtgtgtgtcatgccatcatgggag 
gagtgggacttaagcattttactttgctgtgtttttgttttttctttttttcttttttatttttttgagatggagtctcg 
ctctgtagccaggctggactgtagtggcgcgatctcggctcactgcaaccttggcctcccaggttcaagcgattctcctg 
cctcagcctcccgagtagctgggactctaggcacacaccaccatgcccagctaatttttgtgtttttagtagagacgggg 
tttcaccatgttggccaggatggtctcaatgtcttgacctcgtgatccgcccacctcggtctcccaaagtgctgggaaca 
caggcatgagccactgtgtctggccacattttactttctttgaatatggcaggctcacctccgtgaacaccttgagacct 
agttgttctttgattttagcagaagtgggaggtgaatggttgagctgtagaggtgacatcagcccagccagtggatgggg 
gcttgggaaacattgcttcccattattgtcatgctggagggccctttagcccatcctctccccccgccaccctccttatt 
gaggcctggagcagacttcccagacctggtagtgcttcagggccctggtatgatggacctatatttgctgcttaagacat 
ttgctcccactcaggttgtcccatcagccataaggcccccagggagcccgtgtgatggagcagagagagacctgagctct 
gcaatcttgggcaaggcttttcccttatgtttcttcttatctaaagtgaacagctggggctcatgtgctccctcctcatc 
taaagtgaacacatggggctcatgtgcagggtcctccccgctttcagagcctgaggtcccctgaggctcaggaaggctgc 
tccaggtgagtgccgagctgacttcttggtggacgtgctgtggggacagcccattaaagaccacatcttggggccctgaa 
attgaaagttgtaactgcctggtgcatggtggccaggcctgctggaaacaggttgcaagcgatctgtcacctttcacttt
gatttcctgagcagctcatgtggttgctcactgttgttctaccttgaatcttgaacattatttttcagaaattgataaag 
ttattttaaaaagcacggggagagaaaaatatgcccattctcatctgttctgggccaggggacactgtattctggggtat 
ccagtagggcccagagctgacctgcctccctgtccccagGCTGACTGTCGAAGAACACATCTGGTTCTATGCCCGCTTGA 
AAGGGCTCTCTGAGAAGCACGTGAAGGCGGAGATGGAGCAGATGGCCCTGGATGTTGGTTTGCCATCAAGCAAGCTGAAA 
AGCAAAACAAGCCAGCTGTCAGgtgcggcccagagctaccttccctatccctctcccctcctcctccggctacacacatg 
^^^sggaaaatcagcactgccccagggtcccagcctgggtgcggttggtaacagaaacttgtccctggctgtgcccctag 
gtcctctgccttcactcactgtctggggctggtcctggagtttgtcttgctctgtttttttgtagGTGGAATGCAGAGAA 
AGCTATCTGTGGCCTTGGCCTTTGTCGGGGGATCTAAGGTTGTCATTCTGGATGAACCCACAGCTGGTGTGGACCCTTAC 
TCCCGCAGGGGAATATGGGAGCTGCTGCTGAAATACCGACAAGgtgcctgatgtgtatttattctgagtaaatggactga 
cagagagcggggggcttttgagaagtgtggctgtatctcatggctaggcttctgtgaagccatgggatactcttctgtta 
kcacagaagagataaagggcattgagactgagattcctgagaggagatgctgtgtctttattcatctttttgtccccaac 
atggtgcactaaatttatggttagttgaaagggtggatgcttaaatgaatggaagcggagaggggcaggaagacgattgg 
gctctctggttagagatctgatgtggtacagtatgaggagcacaggcaggcttggagccaactctggcttggccctgaga 
cattgggaaagtcacaacttgcctcaccttctttgccgataataatagtggtgcgttacctcatagaggattaaattaaa 
tgagaatgcacacaaaccacctagcacaatgcctggcatatagcaagttcccaaataaaatgcgtactgttcttacctct 
gtgaggatgtggtacctatatatacaaagctttcccattctaggggtcatagccatacagggtgaaaggtggcttccagg 
tctcttccagtgcttacccctgctaatatctctctagtccctgtcactgtgacaaatcagaactgagaggcctcacctgt 
cccacatccttgtgtttgtgcctggcagGCCGCACCATTATTCTCTCTACACACCACATGGATGAAGCGGACGTCCTGGG
GGACAGGATTGCCATCATCTCCCATGGGAAGCTGTGCTGTGTGGGCTCCTCCCTGTTTCTGAAGAACCAGCTGGGAACAG 
GCTACTACCTGACCTTGGTCAAGAAAGATGTGGAATCCTCCCTCAGTTCCTGCAGAAACAGTAGTAGCACTGTGTCATAC
CTGAAAAAGgtgagctgcagtcttggagctgggctggtgttgggtctgggcagccaggacttgctggctgtgaatgattt 
ctccatctccaccccttttgccatgttgaaaccaccatctccctgctctgttgcccctttgaaatcatatcatacttaag 
gcatggaaagctaaggggccctctgctcccattgtgctacttctgttgaatcccgttttccttttcctatgaggcacana 
gagtgatggagaaggtccttagaggacattattstgtcsaagaaaagagacttgtcaagaggtaagagccttggctacaa 
atgacctggtcgttcctgctcattacttttcaatctcattgaccttaacttttaaactataaaacagccaatatttatta
ggcactgatttcatgccagagacactctgggcattgaaagaaagtaatgataatagttaattttatatagcgttgttacc 
atttcaacctttttttttttttaacctctatcatctcaattaaag 
SEQ ID NO: 22



Genomic contig containing ABC1 exons 23 to 28:

ctgaacacacattaaagcatgagaagcatgaactagacatgtagccaggtaaaggccttgctgagatggttggcaaaggc 
ctcattgcagcattcattggcaggccacagttcttttggcagctctgcttcctgacctttcaccctcaggaagcgaggct 
cttcacacggcacacacatgccagacagggtcctctgaagccacggctgccagtgcatgtgtcccagggaaagctttttc 
ctttagttctcacacaacagagcttcttggaagccctccccggcgaaggtgctggtggctctgccttgctccgtccctga 
cccgttctcacccccttctttgccatcagGAGGACAGTGTTTCTCAGAGCAGTTCTGATGCTGGCCTGGGCAGCGACCAT 
GAGAGTGACACGCTGACCATCGgtaaggactctggggtttcttattcaggtggtgcctgagcttcccccagctgagcaga 
gtggacgcacacgaggagacgtgcagaggctggtggcgctgactcaaggtttgctgctgggctggggctgggtggctgcg 
ggggtgggagcagcttggtggcgggttggcctaatgcttgctggggtgcctggggctcggtttgggagctagcagggcag
tgtcccagagagctgagatgattggggtttggggaatcccttaggggagtggacactgaataccagggatgaggagctga 
gggccaagccaggaggctgggatttgagcttagtacataagaagagtgagagcccaggagatgaggaacagccttccaga 
tttttcttgggtagcgtgtgtaggaggccagtgtcaccagtagcatatgtggaacagaagtcttgacccttgctatctct 
ccctagtcctaatggctggcttttcccaggaaggcttctgcttccatggactgttagattaaccctttatttaggtaaat 
gagggaacctactttataagcataggaaagggtgaagaatcttttaagattcctttactcaagttttcttttgaagaatc 
ccagagcttaggcaatagacaccagactttgagcctcagttatccattcacccatccacccacccacccacccatccttc 
catcctcccatcctcccattcacccatccacccatccagctgtccacccattctacactgagtacctataatgtgcctgg 
ctttggtgatacaaaggtgaataagacatagtcctttcctttgcccccaaccctcagaccagagatgaacatgtggaatg 
acctaaacacctggaacaggtgtggtgtatgagcggcaggcctctgatgagagggtgggggatggccagccctcactccg 
aagcccctctgagttgattgagccatctttgcattctggtcctgcagATGTCTCTGCTATCTCCAACCTCATCAGGAAGC 
ATGTGTCTGAAGCCCGGCTGGTGGAAGACATAGGGCATGAGCTGACCTATGTGCTGCCATATGAAGCTGCTAAGGAGGGA 
GCCTTTGTGGAACTCTTTCATGAGATTGATGACCGGCTCTCAGACCTGGGCATTTCTAGTTATGGCATCTCAGAGACGAC 
CCTGGAAGAAgtaagttaagtggctgactgtcggaatatatagcaaggccaaatgtcctaaggccagaccagtagcctgc 
^ttgggagcaggattatcatggagttagtcattgagtttttaggtcatcgacatctgattaatgttggccccagtgagcc 
atttaagatggtagtgggagatagcaggaaagaagtgttttcctctgtaccacagtacatgcctgagatttgtgtgttga 
aaccagtggtacctaacacatttacatcccaaccttaaactcctatgcacttatttaccctttaatgagcctctttactt 
aagtacagtgkgaggaacagcggcatcaggatcacttgggaacttgttagaaattcagcaacttgggcccagctcagacc 
tactgaatcagaatcaggagcaattctctggtgtgactgtgtcacagccaggtatcaactggattctcatacataggaaa 
tgacaaacgtttatggatggatagtctacttgtgccaggtgctgagatttgttttttgttttttgatttttttttaatca 
ctgtgacctcatttaattctcaaaaaaagatgaaaaaatgaacactcaggaatgctgacatgagattcagaatcaggggt 
ttggggcttcaaagtccatcctctctttatccatgtaatgcctccccttagagatacaacatcacagaccttgaaggctg 
aaggggatataaaagctgtctggccaagtggtctccaagcttgacagtgcagcagaatcacctggggatattattaaaaa 
taaacatactaaggtttggcttcagggcctgtgaatcagaatttctggaggtgaggccttgaagtctgtatttctattgc 
atactttggacacagtggtctatagactagagtttggaaatgattgcgctcattcagattctcttctgatgtttgaattg 
ctgccatcatatttctagtgctctatttcctcctgctcattctgtcttggataacttatcatagtactagcctactcaaa 
gatttagagccacagtcctgaaagaagccacttgactcattccctgtaggttcagaataaatttcttctgcgcagtgtct 
gtcatagctttttttaaatttttttttatttttgatgagactggagttttgctcttattgcccaagctggagtgcagtgg 
tgcgattttggctcactgcaacctccacctcccaggttcaagcgattctcctgcctcagcctcccaagtagctgagatta 
caagcatgtgctaccacgcccagctaattttgtatttttagtagagatgggttttatccatgttggtcaggctggtctcg 
agctccagacctcaggtgatctgcccgcctcggcctcccaaagtgctgggattataggcctgagccacagcgctcagcca 
taactttaatttgaaaatgattgtctagcttgatagctctcaccactgaggaaatgttctctggcaaaaacggcttctct 
cccaggtaactctgagaaagtgttattaagaaatgtggcttctactttctctgtcttacggggctaacatgccactcagt 
aatataataatcgtggcagtggtgactactctcgtaatgttggtgcttataatgttctcatctctctcattttccagATA 
TTCCTCAAGGTGGCCGAAGAGAGTGGGGTGGATGCTGAGACCTCAGgtaactgccttgagggagaatggcacacttaaga 
tagtgccttctgctggctttctcagtgcacgagtattgttcctttccctttgaattgttctattgcattctcatttgtag 
agtgcaggtttgttgcagatcgggaaggtttgttttgttgtaaataaaataaagtatgggattctttccttgtgccttca 
gATGGTACCTTGCCAGCAAGACGAAACAGGCGGGCCTTCGGGGACAAGCAGAGCTGTCTTCGCCCGTTCACTGAAGATGA 
TGCTGCTGATCCAAATGATTCTGACATAGACCCAGgtctgttagggcaagatcaaacagtgtcctactgtttgaatgtga 
aattctctctcatgctctcacctgttttctttggatggcctttagccaaggtgatagatccctacagagtccaaagagaa 
gtgacgaaatggtaaaagccacttgttctttgcagcatcgtgcatgtgatcaaacctgaaagagcctatccatatcactt 
cctttaaagacataaagatggtgcctcaatcctctgaacccatgtatttattatcttttctgcggggtcctagtttcttg 
tatacattaggtgtttaattgttgaacaaatattcattcgagtagatgagtgattttgaaagagtcagaaaggggaattt 
gctgttagagttaattgtaccctaagacttagatatttgaggctgggcatggtggctcatgccagtaatcccagcgcttt 
gagaggctgaggtgggtagatcacctgaggtcaggagtttgagaccagtctgaccaacaaggtgaaaccccgtctctact 
aaatacaaaaaattagccgagtgtggtggcacatgcctgtcatcccagctacttgggaggctgaggcaggagaatcgctt 
gaacccaggaggcagaggttgcagtcagccacggttgcgccattgcactccagactgggcaacaagagtgaaaactccat 
ctcaaaaaagaaaaaaaaagaattagatattttggatgagtgtgtctttgtgtgtttaactgagatggagaggagagcta 
agacatcaaacaaatattgttaagatgtaaaagcacatcagttaggtatcattactttaggacaaggatttctagaaaat
tztttaggaacagaaaactttccagttctctcacccctgctcaaagagtgtatggctcttacattatatataactgcccga 
cttcatacagtatcagtacttagatcatttgaaatgtgtccacgttttaccaaaatataatagggtgagaagctgagatg 
ctaattgccattgtgtattctcaaatatgtcaagctacgtacatggcctgtttcatagagtagtctataagaaattgatg 
acttgattcatccgaatggctggctgtaacacctggttacgcatgaacacctcttttcagttgtctcaagacacctttct
tttctgtacttatcagacaaggactgaaaggcagagactgctactgttagacattttgagtcaagcttttccttggacat 
agctttgtcatgaaagccctttacttctgagaaacttctagcttcagacacatgccttcaagatagttgttgaagacacc 
agaagaaggagcatgccaatgccgaaaacacctaagataataggtgaccttcagtgttggcttcttgcagAATCCAGAGA 
GACAGACTTGCTCAGTGGGATGGATGGCAAAGGGTCCTACCAGGTGAAAGGCTGGAAACTTACACAGCAACAGTTTGTGG 
CCCTTTTGTGGAAGAGACTGCTAATTGCCAGACGGAGTCGGAAAGGATTTTTTGCTCAGgtgagacgtgctgttttcgcc 
agagactctggcttcatgggtgggctgcaggctctgtgaccagtgaaggcaggatagcatcctggtcaagatatggatgc 
cggagccagatttatctgtatttcaatcccagttctattccttgccagttgtgtatccgctggcaagttacttctctatg 
cctcaatctcctcatctgtaaaatggggataataatattacctgcaatacagggttgttacgaaaataaaaatgaatagg 
tgcttagaatggggcctgacattagtaagtgcttagttttgtgtgtgtatatgttatttttattttggaggagaacataa 
aaaggacaaagtgtagaaaaactggttgggtgtattcagctgtcataacatgagagttgttatgcccagatgcacttgac 
atgtgaatttattagaaacatgatttttctctgagttgatgtttaactcaaactgatagaaaagataggtcagaatatag 
ttggccaacagagaagacttgttagactattgtctgcatgtcagtgtttgcatgctaacttgcttagttagaaaggttaa 
attttttcactctataaaatcaagaaatatagagaaaaggtctgcagagagtctttcatttgatgatgtggatattgtta 
agagcgggagtttggagcatacagagctcaagttgaatcctgactttgctacttattggctatatgaccttgggcaagct 
gcttagtctctctgatcctcagttacctttgtttgttgatgatgaccattgataacacaaccataaataatgacaacata 
gagatagttctcattatagtagttgttatacagaattattcactcaatgttaattttctgcattgaaatcccagaacatt 
agaattgggggcattatttgaatctttaaggttataaggaatacatttctcagcaataaatggaaggagttttgggttaa 
cttataaagtatacccaagtcattttttttcagagaagatatggtagaaagtcttaggaggttgaagaaggaattggata 
tttattctttctgagactatcatgggagataatgactatggttgtccatgattggagccgttgctgtagagttggtttta 
ttatagtgtaggatttgaatgggccatgtgttctcagacctcagaataaaaagagaaaactgaggccagtggggagcgtg 
acttcacatgggtacacttgtgctagagacagaaccaggattcaggacttctggctcctggtcctgggttcatggcccaa 
tgtagtctttctcagtcttcaggaggaggaagggcaggacccagtgttctgagtcaccctgaatgtgagcactatttact 
tcgtgaacttcttggcttagtgcctctgccaggtggccataacctctggccttgtgttgccagagaaaaggtttagtttt 
caggctccattgcttcccagctgccaagaatgccttggtgcagcacagtcataggccctgcattcctcattgccgtgctg
gttggtcggggaggtgggctggactcgtagggatttgccccttggccttgtttctaacacttgccgtttcctgctgtccc 
cctgccccctccactgcctgggtaaagATTGTCTTGCCAGCTGTGTTTGTCTGCATTGCCCTTGTGTTCAGCCTGATCGT
GCCACCCTTTGGCAAGTACCCCAGCCTGGAACTTCAGCCCTGGATGTACAACGAACAGTACACATTTGTCAGgtatgttt 
gtcttctacatcccaggagggggtaagattcgagcagaccaaagatgtttacgagggccaagggaatggacttcagaatt 
acacggtggaat 
SEQ ID NO: 23



Genomic contig containing ABC1 exon 29:

aggaaccatttaaaaaaaaaaaagtatatatatatatatatatatatatatgtaatgtgaattggcctctttttctctaa 

acccacattttcttcttacatagttcaggtttactttatttttccctttccggctgctgaccctgtattgcccgtagttg 

tggaacataccatgtgtttgtgacctgtgcctgttatttttgtgctttctagttgtgcatgcaaagagtacaaagttttc 

ttgccctttcttagaaaatcctgcttgtctgtgccaaagggataattgtgaaagcacttttgaaatacttaatgagttga 

ttttcttcaaattaaaaaaaatatataaatgtatatgtgtatgtacatgtgtgtacacatacacacctttatacatacag 

cccatttaaaacaagctccactttggagtgctctacgtcaccctgatgccgaatacagggccagagtctgagatccttct 

aggtgatttctgtgttttgttcatttctgttttaagagcctgtcacagagaaatgcttcctaaaatgtttaatttataaa 

lacatttttatctctcgattactggttttaatgaattactaagctggctgcctctcatgtacccacagCAATGATGCTCC

TGAGGACACGGGAACCCTGGAACTCTTAAACGCCCTCACCAAAGACCCTGGCTTCGGGACCCGCTGTATGGAAGGAAACC 

CAATCCCgtcagtgccactttagccataagcaaggcttcttgtgcttgttgcctggtttgatttctaatatgctgcattt 

atcaactgcatgccacattgtgaccgccagcatttgccctttgaattattattatgttttatttacaaaaagcgaaggta 

gtaaccgaactaaattatctaggaacaaacgtttggagagtcttctaacaccgyscaaagcacgtcattacagacatttg 

tttactgatttagaaccttaatatttaatttaaatacgcactttacacttactgatgaaatgcttttcctttctttctct 

cccagcccctgtacttaagtgcttcaataggctctcattatatatgatttttaggttttgcttatcagcttcttcgcttt 

tataatctaaaaagatggcatatgaatttttataaaaagggacactttcttcttctcaaattgtatatttttattgtact 

ttccttcaaaacccccttttaaaaagtaagcagtggataaataaattcagtgaagcatccatatgacccttaagtgagtg 

tagggaaagcgaggtcaccagatcactgtgagtgaagatggtggagaggtgaggatcttatgaggccgtgctcaaggctg 

gtagaagtgcgttagtgtttccaggtttaggcagaatctcagctgaggtcatgaaacaacagtgatctctgaaaaattat 

ggcaaagtgagaaggtactggagaattggagagggggcaaacttgactttcaagtttcaatgggaagataggtgactctg 

cacaccacaaaacagtgagcatgataacctgtttatacaaggttctagagcagatttctaaatggatagctactgtgtgc 

ttgtttgttcttaattagtattggatagttactaaatacttgttagtacttagtacataatgggtggtaaatcctagcag 

ctaatattgattcccaaataaccagatgacaaggatagagaaggacacagacacggcctatctggatttcatggtgcctt 

tgattttccacatgaaggttgtgtagggaagatagaagcatgagatgagatgataatatagttatctggattcatcactg 

gccagctgaaccatatgaactcatggattgatgctagcttaggaaggctctgtaggagccagaactgggctgagagccag 

cccatagagacaaaagaggcccggccctgacatcagagggttcaaacatgatgtctgagccccacctacagtctgccgga 

ggtggttgaaaggaagagcctttatccttacaattcttactgaaattcaaatttttaggttttgcaaaaaaatggtggac 

^tgaaggaaatttgacaggagcatgtctcagctgtatttaaatttgtctcagccaatccccttttgaatgttcagagtgt 

aagcttcaggagggcagcgcatcttagtgtgacttttctggtcagttcaggtgctttaaggagacaattagagatcaatc 

tggaaaacttcatttgaatttttaatacataagaaaacaataagaaatagttaaaaatatatatttatataatatatata 

tgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtatatatatatatattttatttatttatttttttttgagatggagtctcg 

ctctgttgcccaggctggagtgcagtggctcaatcttggctcactgccacctctgcctcccaggttcaagtgattctcct 

acctcagcctcctgagtagctgggattacaagcatgtgccaccacactggctaa 
SEQ ID NO: 24



Genomic contig containing ABC1 exons 30 and 31:

tcttcccagtctctactcatttttcagcacatccagcataagatccagactctttcccaggcctctctcatctggctcct 
ctcctcctcctttatcattactcttcttcgtaccttatcctactccagccatgctgtcttcctattattcctaaaaarta
caaatgcatttcttcctagggcctttgtacctgcacttgccatcgcttttgctcagaatgttctttttgccaagcttttg 
cccagcttgttctccatcattgttatgttttggctgaaatgtcttctcttagtaggttcattctccccagtcactgtctt 
tttattttgctttattttgggccatctaaggttatcttattagtgtatttgttgttcgtctcctccatgggcatacacct 
ccatgaaggcaggtattttcaccttaggccctcgaatatactggacagcatctgccacgtagtagatgctcaacgaatgt 
ttgttgtgtgagcaaatggttggttgattggattgaactgagttcagtatgtaaatatttagggcctctttgcattctat 
tttacttatgtataaaatgatacataatgatgatataaatgatgtcacagtgtacaaggctgttgtgggatcaagcaatc 
aaatgagatcatgcttgtcttttccaaatggtgagggaatagatgcatgtttgtggttgttacggaatgatcctgtgctc 
ctgaggcaacagaaaggccaggccatctctggtaatcctactcttgctgtcttccctttgcagAGACACGCCCTGCCAGG 
CAGGGGAGGAAGAGTGGACCACTGCCCCAGTTCCCCAGACCATCATGGACCTCTTCCAGAATGGGAACTGGACAATGCAG 
AACCCTTCACCTGCATGCCAGTGTAGCAGCGACAAAATCAAGAAGATGCTGCCTGTGTGTCCCCCAGGGGCAGGGGGGCT 
GCCTCCTCCACAAgtgagtcactttcagggggtgattgggcagaaggggtgcaggatgggctggtagcttccgcttggaa 
gcaggaatgagtgagatatcatgttgggagggtctgtttcagtcttttttgttttttgtttttttttctgaggcggagtc 
ttgctctgtcgcccaggctggagtgctgtggcatgatcttgcctcactgcaacctccacctcccaggttcaagcgattct 
cctgcctcagcctcctgagtagctgggattacaggcacgcaccaccatgtctggctaatttttgtgtttttagtagagat 
agggtttcgccgtgttggctaggctggtctggaattcctgacctcaggtgatccacccgcctcggcctcccaaagtgctg 
ggattacaggcgtgagccactacgcccagccctgtttcagtctttaactcgcttcttgtcataagaaaaagcatgtgagt 
tttgaggggagaaggtttggaccacactgtgcccatgcctgtcccacagcagtaaagtcacaggacagactgtggcaggc 
ctggcttccaatcttggctctgcaacaaatgagctggtagcctttgacaggcctgggcctgtttcttcacctctgaatta 
gggaggctggaccagaaaactcctgtggatcttgtcaactctggtattcttagagactctgtttgggaaggagtcctgag 
ccattttttttttcttgagaatttcaggaagaggagtgcttatgatagctctctgctgcttttatcagcaaccaaattgc 
aggatgaggacaagcaattctaaatgagtacaggaactaaaagaaggcttggttaccactcttgaaaataatagctagtc 
caggtgcggggtggctcacacctgtaatctcagtattttgggatgccgaggtggactgatcacctaaggtcaggagttcg 
aaaccagcttggccaatgtggcgaaaccctgtctctactaaaaattcaaaaattagccaggcatggtggcacatgcctgt 
aatcccagttacttgggaggctgaagcaggagaattgcttgaacctgggaggtggaggtcgcagggagccaaaattgcgc 
cactgtactccagcctgagcaacacagcaaaactccatatcaaaaaataaaatgaataaaataacagctaatctagtcat 
cagtataactccagtgaacagaagatttattaggcatagtgaatgatggtgcttcctaaaaatctcttgactacaaagaa 
tctcatttcaatgtttattgtttagatgttcagaataaattcttgggaaagaccttggcttggtgtaagtgaattaccag 
tgccgagggcagggtgaaccaagtctcagtgctggttgactgagggcagtgtctgggacctgtagtcaggtttccggtca 
cactgtggacatggtcactgttgtccttgatttgttttctgtttcaattcttgtctataaagacccgtatgcttggtttt 
catgtgatgacagAGAAAACAAAACACTGCAGATATCCTTCAGGACCTGACAGGAAGAAACATTTCGGATTATCTGGTGA
AGACGTATGTGCAGATCATAGCCAAAAGgtgactttttactaaacttggcccctgccttattattactaattagaggaat
taaagacctacaaataacagactgaaacagtgggggaaatgccagattatggcctgattctgtctattggaagtttagga 
tattatcccaaactagaaaagatgacgagagggactgtgaacattcagttgtcagcttcaaggctgaggcagcctggtct 
agaatgaaaatagaaatggattcaacgtcaaattttgccac 
SEQ ID NO: 25



Genomic contig containing ABC1 exon 32:

gcatgctggagtgatagtgaccatgagtttctaagaaagaagcataatttctccatatgtcatccacaattgaaatatta 
ttgttaattgaaaaagcttctaggccaggcacgctggctcatgcctgtaatcccagcactttaggagccaaggcgggtgg 
atcacttgaggtcaggagtttgagaccagcctggccaacatggggaaaccctgtctctactaaaaatacaaaataagctg 
ggcgtggtggtgcgtgcctgtaatcccagctacttgggaggctgaggcaggagaactgcttgaatctgggaggcggaggt 
tgcagtgagctgagttcatgccattgcattccagcctgggcaacaagagcgaaaccatctcccaaaagaaaaaaaaaaga 
aagaaaaagcttctagtttggttacatcttggtctataaggtggtttgtaaattggtttaacccaaggcctggttctcat 
ataagtaatagggtatttatgatggagagaaggctggaagaggcctgaacacaggcttcttttctctagcacaaccctac 
aaggccagctgattctagggttatttctgtccgttccttatatcctcaggtggatatttactccttttgcatcattagga 
ataggctcagtgctttctttgaactgattttttgtttctttgtctctgcagCTTAAAGAACAAGATCTGGGTGAATGAGT 
TTAGgtaagttgctgtctttctggcacgtttagctcagggggaggatggtgttgtaggtgtgcttggattgaagaaagcc 
^tggggattgtttgtcactcacacacttgtgggtgccatctcactgtgagga 
SEQ ID NO: 26



Genomic contig containing ABC1 exons 33 to 36:

gctttatagactttctgcctagagcatcatggctcagtgcccagcagcccctccagaggcctctgaatatttgatatact
gatttccttgaggagaatcagaaatctcctgcaggtgtctagggatttcaagtaagtagtgttgtgaggggaatacctac 
ttgtactttccccccaaaccagattcccgaggcttcttaaggactcaaggacaatttctaggcatttagcacgggactaa 
aaaggtcttagaggaaataagaagcgccaaaaccatctctttgcactgtatttcaacccatttgtccttctgggttttga 
aggaacaggtgggactggggacagaagagttcttgaagccagtttgtccatcatggaaaatgagataggtgatgtggcta 
cgtcagggggcccgaaggctccttgttactgatttccgtcttttctctctgccttttccccaagggccaggacccctgga 
tctctgggcagagcagacgcaggcccctataatagccctcatgctagaaaggagccggagcctgtgtataaggccagcgc 
agcctactctggacagtgcagggttcccactctcccaactccccatctgcttgcctccagacccacattcacacacgagc 
cactgggttggaggagcatctgtgagatgaaacaccattctttcctcaatgtctcagctatctaactgtgtgtgtaatca 
ggccaggtcctccctgctgggcagaaaccatgggagttaagagattgccaacatttattagaggaagctgacgtgtaact 
tctgaggcaaaatttagccctcctttgaacaggaatttgactcagtgaaccttgtacacactcgcactgagtctgctgct 
gatgatactgtgcaccccactgtctgggttttaatgtcaggctgttcttttagGTATGGCGGCTTTTCCCTGGGTGTCAG 
TAATACTCAAGCACTTCCTCCGAGTCAAGAAGTTAATGATGCCATCAAACAAATGAAGAAACACCTAAAGCTGGCCAAGg 
taaaatatctatcgtaagatgtatcagaaaaatgggcatgtagctgctgggatataggagtagttggcaggttaaacgga 
tcacctggcagctcattgttctgaatatgttggcatacagagccgtctttggcatttagcgatttgagccagacaaaact 
gaattacttagttgtacgtttaaaagtgtaggtcaaaaacaaatccagaggccaggagctgtggctcatgcctgtaatcc 
tagcactttgggaggctgaagcgggtggatcacttgaggtcaggagttcgagaccagcctggcctacatgacaaaacccc 
gtatctactaaaaatacaaaaaaattagctgggcttggtggcacacacctgtaatcccagctacttgggaggctgaggca 
ggagaattgcttgaaccctgtaggaagaggttgtagtgagccaagatcgcaccgttgcactccagcctgggcaacaagag
caaaactccatctcaaaaaacaaattaaatccagagatttaaaagctctcagaggctgggcgcggtggcttacacctgtt
atcccagcattttgggatgccgaggcgggcaaagcacaaggtcaggagtttgagaccagcctggccaacatagtgaaacc 
ctgtctctgctaaaaacatagaaaaattagccgggcatggtggcgtgcgcctgtaatcccagctactcgggaggctgagg
tgagagaattrcttgaacccgggaggcggaggttgcagtgagcccagattgcaccactgcactccagcctgggcgacaga
gcaagactccatctcaaaaaaagctctcagaacaaccaggtttacaaatttggtcagttggtaaataaactgggtttcaa
acatactttgctgaaayaatcactgactaaataggaaatgaatctttttttttttttttttaagctggcaagctggtctg
taggacctgataagtactcacttcatttctctgtgtctcaggtttcccatttttaggtgagaattaaggggctctgataa
aacagaccctaggattgtggacagcagtgatagtcctagagtccacaagtctgcttttgagtgatgggcccatgtatctg
gcacatctgcaggcagagcgtggttctggctcttcagatgatgccggtggagcactttgaggagtcctcaccccaccgtg 
ataaccagacattaaaatcttggggctttgcatcccaggatttctctgtgattccttctagacttgtggcatcatggcag 
catcactgctgtagatttctagtcacttggttctcaggagccgtttatttaatggcttcacatttaatttcagtgaacaa
ggtagtggcattgctcttcacagggccgtcctgttgtccacaggttccagattgactgttgccccttatctatgtgaaca 
gtcacaactgaggcaggtttctgttgtttacagGACAGTTCTGCAGATCGATTTCTCAACAGCTTGGGAAGATTTATGAC 
AGGACTGGACACCAGAAATAATGTCAAGgtaaaccgctgtctttgttctagtagctttttgatgaacaataatccttatg
tttcctggagtactttcaactcatggtaaagttggcaggggcattcacaacagaaaagagcaaactattaactttaccag
tgaggcagtacggtgtagtgtagtgattcagagaatttgctttgccaccagacataccaggtaaccttgactaagttact
taacctatctaaacctcagttycctcatctgtgaaatggagacagtaatcatagctatttccaaactgttgtgagaattc 
aatgagttaaaggtataaggtcctcaccacagcgcctgcccacatagtcagtgatcactatgtcctgaacactgtaatta 
cttcgccatattctctgatcatagtgttttgccttggtatgtgactagaatttctttctgaggtttatgggcatggttgg 
tgggtatgcacctgcctgcaggagcccggtttgggggcattaccttgtacctggtatgttttctttcagGTGTGGTTCAA 
TAACAAGGGCTGGCATGCAATCAGCTCTTTCCTGAATGTCATCAACAATGCCATTCTCCGGGCCAACCTGCAAAAGGGAG
AGAACCCTAGCCATTATGGAATTACTGCTTTCAATCATCCCCTGAATCTCACCAAGCAGCAGCTCTCAGAGGTGGCTCTg 
taagtgtggctgtgtctgtatagatggagtggggcaagggagagggttatggagaaggggagaaaaatgtgaatctcatt 
gtaggggaacagctgcagagaccgttatattatgataaatctggattgatccaggctctgggcagaagtgataagtttac 
gaattggctggttgggcttcttgaactgcagaagagaaaatgacactgatatgtaaaaatcgtaacatttagtgaattca 
tataaagtgagttcaaaaattgttaattaaattataatttaattataagtgtttaatcagtttgatttgtttaaaaacca 
ctgttttaaatttggtggaatatgtttttattagcttgtatctttaattcctaaattaagctgtgtgtgtgtgtgtgtgt 
gtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgaagtttaaagccaggatgagctagtttaaagtatgcagcctttggagtc 
atacagatctgggtttgaatctggtctctaaactttatagatgtatgatattaaatgaggcagttcatgtaaattgccaa 
gcccagcactcagcacagagttgatatttcacacacattagatacctttcctgtatgtggagcatggcagttcctgtttc 
tgctttactcctacaggatactaatataggacactaggatctttataccaagaccccatgtaatgggcttatgagaccat 
tcttcttataaaaatctgacagaatttttgtatgtgttagatcaataggctgcatactgttattttcaagttgatttaca 
gccagaaatattaatttatttgagtagttacagagtaatatttctgctctcatttagttttcaagccccactagtccttt 
gtgtgtgaaaatttacaacttactgctcttacaaggtcatgaacagtggaccaaagtgaatgccattaaccactctgact 
tccttcattagttttattgtgacagtggactcttttgacctcagtaataccagtttggcatttacattgtcatattttta 
gacttaaaaatgatcatcttaaccctgaataaaatgtgtctggtgaacagatgtttttccttggctgtgcctcagatatc
tctgtgtgtgtgtacgtgtgtgtttgtctgtgtgtccatgtcctcactgattgagccctaactgcatcaaagacccctca
gattttcacacgctttttctctccagGATGACCACATCAGTGGATGTCCTTGTGTCCATCTGTGTCATCTTTGCAATGTC
CTTCGTCCCAGCCAGCTTTGTCGTATTCCTGATCCAGGAGCGGGTCAGCAAAGCAAAACACCTGCAGTTCATCAGTGGAG 
TGAAGCCTGTCATCTACTGGCTCTCTAATTTTGTCTGGGATATGgtaaggacacaggcctgctgtatctttctgatgtct
gtcagggccatggattgatatggataagaaagaaagagctctggctatcatcaggaaatgttccagctactctaaagatg 
tatgaaaaagaaatagccagaggcaggtgatcactttcatgacaccaaacacagcattgggtaccagagttcatgtcaca 
ccagagggaaaattctgtacacaatgatgaaaattaataccactaccacttaagttcctatgtgacaactttcccaagaa 
tcagagagatacaagtcaaaactccaagtcaatgcctctaacttctctgatgggttttaacctccagagtcagaatgttc 
tttgccttactaggaaagccatctgtcatttagaaaactctgtacattttatcagcagcttatccatccattgcaaatat 
tgtttttgtgccasccacaatatattgcttctatttggaccaatatgggggatttgaaggaattctgaagttctaattat
atttcaactctactttacaatatctccctgaaatatatctccctgtaacttctattaattataagctacacagagcaaat 
ctaattcttctcccaccgaacaagtccctggatatttaaaaataactctcatactctcatttaacctgagtattacccag 
ataagatgatatatgagaatacaccttgtaacctccgaagcactgtacaaatgtgagcaatgatggtggagatgatgatg 
agatctttgctgtttataccaagccccttagactgtgtcactcttctgatccggttgtccttgtatggccatgctgtata 
ttgtgaatgtcccgttttcaaaagcaaagccaagaattaaccttgtgttcaggctgtggtctgaatggttatgggtccag 
S999sgttgatctttagctcacacttctattactgcagcacaaagattttgcattttggaaggagcaccgtcttactggc 
aacttagtggtaaaccaaaacctccatttcacacaaatgattgtgaaattcgggtctccttcattctatacaaattcatt 
tgatttttttgaaactaaactttatatttatccatattaaattacatgggttttatttttgttttatcttgattcagtaa 
ttactcctttcagtaaacacagactgagtgctgtgtgtctgacttatgccaggcataggtgattcagagatgaaaggtca 
agtccctgaacccatctcttgtcttcctgggtattatctgtccctccctgctttagagctcctgaaatttgctagaagca 
tgtcttcatctaagttgttgataaacacatcaagtaggattggactgaggcagagccctgtagtctgaagctgcagttct 
tctagcggctgacaagccccactatcacttccctgctggtgctttgctctgccagctgtgaattctcataattgtcctat 
cgtcaagtctttatttctgcattttactgcttgatacactgtcaggacagactttaaaattattctcagtgcgatgaaac 
aattctgacattcatgttatgagcagttacctcataaatagattacatg 
SEQ ID NO: 27



Genomic contig containing ABC1 exons 37 to 41:

aaattactctgactgggaatccatcgttcagtaagtttactgagtgtgacaccttggcttgactgttggaaagacagaaa 
gggcatgtagtttataaaatcagccaaggggaaaatgcttgtcaaaatgtattgtcgggtattttgattaatagtttatg 
tggcttcattaattcagagttactctccaatatgtttatctgccctttcttgtctgataatggtgaaaacttgtgtgatg 
cattgtatatttgatttaggggtgaactggatgtctttgttttcacttttagTGCAATTACGTTGTCCCTGCCACACTGG 
TCATTATCATCTTCATCTGCTTCCAGCAGAAGTCCTATGTGTCCTCCACC^U^TCTGCCTGTGCTAGCCCTTCTACTTTTG 
CTGTATGGgtaagtcacctctgagtgagggagctgcacagtggataaggcatttggtgcccagtgtcagaaggagggcag 
ggactctcagtagacacttatctttttgtgtctcaacagGTGGTCAATCACACCTCTCATGTACCCAGCCTCCTTTGTGT 
TCAAGATCCCCAGCACAGCCTATGTGGTGCTCACCAGCGTGAACCTCTTCATTGGCATTAATGGCAGCGTGGCCACCTTT 
GTGCTGGAGCTGTTCACCGACAATgtgagtcatgcagagagaacactcctgctgggatgagcatctctgggagccagagg 
acagtgtttaattgtgatcttattccacttgtcagtggtattgacactgctgactgccttgtcctgtcttcagagtctgt 
cttccctgagaaggcaaagcacctttctttcttgctgtgccttacattttgctggtcaagcctttcagtttcttttgaca
gttttttttacttctttcttttttcaatgttgctcttaccaagagtagctcctctgccttccactttacacatgagagct 
gggcgacgcattcagtcctaaggcttttaccatcacctctcttggtgtttttattgtcatctctaagatcaatgccttta 
gccttgatcataaccttgaactctaatctcaaattctcacttgcctagtggattgctccatttagatagtatatagatac 
cccaacctggatatgtcctagttttctttccccttggaacttaatgcttttcttgccatccctgtcacactcagtggcac 
taccatccactcggttgcccaagctggctcttagagttatcctagatgcttgctttgctgttgcagatttcccacattca 
actggttatgttgtcagttcttccaggtatggacctctaaaataaggcttcctctccattccggttgtcattgcctttgt 
ccaaacacagcacacaaggccttttacagttgcacaactcttcctgtccatacccaccacaccctttcccagctgtaagc 
ttcagatgagttgcctccaaccaccatgctcctgtaggcctggcttgaaatgcccttcttctgtcacagggtctggtagt 
atatcccttgcccttcaagatttagctaaaatgtgaagctttccttacctgctgggaggtgttctctcttttctctgtgc 
tctcagagtccttagtccatgcctccagtacaacgtacatccacttacatggtaatttcctgtttacatacttttcctac 
tcggagtggagtctgtttcttaataattttgcctctcccatgccctagcacagtgcatccagcgtatagccccttattca 
gttggtagatatttggccactgttgccttgtgggatcataagttctgatgtatttgagaagaatttctaaaattctgaca 
aaatcctgaaactcaaatattgacccagacatgagcaatttgcttttcaaatgctaagggatttttaatggatttgcttt 
aattaaatctagcctgtttctaagctttattcattatttctccatactcagagcatttctccagattttctaaagaatag 
aattttattgctacatatcatcagctatgcctgctgctatttaattggtatctgaattaaaaggtctggtttgtccctag 
agaatcaaattttttcttcactcccatatttcagaacttgatacatttttaggataaaccatgaatgacacccgtttctt
ctccctcaccctcccttccctcccatttttttttttttttttttttagAAGCTGAATAATATCAATGATATCCTGAAGTC
CGTGTTCTTGATCTTCCCACATTTTTGCCTGGGACGAGGGCTCATCGACATGGTGAAAAACCAGGCAATGGCTGATGCCC
TGGAAAGGTTTGgtgagtgaagcagtggctgtaggatgctttaatggagatggcactctgcatagcccttggtaccctga
actttgttttggaaagaagcaggtgactaagcacaggatgttcccccacccccatgcccagtgacagggctcatgccaac 
acagctggttgtggcatgggttttgtgacacaaccatttgtctgtgtcuctgacagcattgagaaaagtgaaagggcagt 
tttgaaggtaaggaaaatagtgttatttgcttggatccactggctcatgccactgtctgggttggttagaagcactggaa 
aagtcaaaccataactttgagaattaggtgatcagggaatcagaaggaaagatgcaaactttggctcttttaggcgaatc 
atgtgcctgcagatgaggtcatttattatcttttacacagtctataaaattataatgtattacatctttttctaccttta 
gaatggttaaaaatatttctccggtagccatatgattattattcatccattagataatatagtcaaatgggccatgttat 
ttactgttcatagaagaggggctttttgcaacttgggctacaaaggagatatgtaaggaatttaaggaatggttacatgg 
aactagatttaattgaatctagtggtttaattgattcactaggatatatgctactgaaaggggaatctgcttaaagtgct 
ttctgatatttattattactaaaacttagaatttattaaaaatactgactgtgaaaattacttgggtcgtttgccttttt 
aaaaggatttttggcatgtctcattaaaaaaagaaatactagatatcttcagtgaagttacaaatcgaatacacattggc 
tctgaaattctgattgatactgggtcataaaaagttttcccaaatcagacttggaaagtgatcactctcttgttactctt 
ttttccttgtcatgggtgatagccatttgtgtttattggaagatcggtgaattttaaggaacataggcccaaatttgagg 
aagggccatggtttttgatccctccattctgaccggatctctgcattgtgtctactagGGGAGAATCGCTTTGTGTCACC 
ATTATCTTGGGACTTGGTGGGACGAAACCTCTTCGCCATGGCCGTGGAAGGGGTGGTGTTCTTCCTCATTACTGTTCTGA 
TCCAGTACAGATTCTTCATCAGGCCCAGgtgagctttttcttagaacccgtggagcacctggttgagggtcacagaggag 
gcgcacagggaaacactcaccaatgggggttgcattgaactgaactcaaaatatgtgataaaactgattttcctgatgtg 
ggcatcccgcagccccctccctgcccatcctggagaccgtggcaagtaggttttataatactacgttagagactgaatct 
ttgtcctgaaaaatagtttgaaaggttcatttttcttgttttttcccccaagACCTGTAAATGCAAAGCTATCTCCTCTG 
AATGATGAAGATGAAGATGTGAGGCGGGAAAGACAGAGAATTCTTGATGGTGGAGGCCAGAATGACATCTTAGAAATCAA 
GGAGTTGACGAAGgtgagagagtacaggttacaatagctcatcttcagtttttttcagctttatgtgctgtaacccagca 
gtttgctgacttgcttaataaaagggcatgtgttcccaaaacgtacatctataccaaggttctgtcaattttattttaaa 
aacaccatggagacttcttaaagaattcttactgagaattcttttgtgatatgaattcccattctcgaatactttggttt 
tatatgcttacatttatgtgttagttattaaaacatactaatattgtatatctagtcaaactgagtagagagataatggt 
gatt 
SEQ ID NO: 28



Genomic contig containing exons 42 through 45: 

ttttaaaatacctgcaatacatatatatgttgaatagatgaaaaattatgtagatgataatgaatgatacggttctaaaa 
agacaggttaaaaactaagttcacttttattttgagcttcagaatcattcagaagccagtcgccacaaacgcagaccaag 
gctcttggcacatcaaatatgcctatggcttagggttattgacaagtcttatgttgcagtgtatgtggtttatagtcctg 
ccttccacagttgcttgggagagctgtgagtcactgagccttatgaatgtttacattttgtttgttgcagATATATAGAA 
GGAAGCGGAAGCCTGCTGTTGACAGGATTTGCGTGGGCATTCCTCCTGGTGAGgtaaagacactttgtctatattgcgtt 
tgtccctattagttcagactatctctacccaatcaagcaacgatgctcgttaagaggtaaaagtggattttaaaggcttc 
tgtatttatgccagcatggagcaattagtcatcgagaagagagggaccctgtatgtcaagagaatgatttcagagaatcc 
aatacaatttaagaaaaagcatggggctgggcgcagtgattcactcctgtaatcccagcactttgggaggccgaggtggg 
cggactcacgaggtcaggagattgagaccatcctggccaacatggtgaaaccccatctctactataaatacaaaaattag 
ctgggcatagtagtgcattcctgtagtcccagctactcgggaggctgaggcaggagaattgcttgaacctaggaggggga 
ggttgcccagattgcgctgctgcactccagcctggtgacagagtgagactcatgtcaacaacaaaaacagaaaaagcacg 
cacatctaaaacatgcttttgtgatccatttgggatggtgatgacattcaaatagttttttaaaaatagattttctcctt 
tctggtttccgtttgtgttcttttatgcccttttgccagagtaggtggtgcaatttggctagctggctttcattactgtt 
tttcacacattaactttggcctcaacttgacaactcaaataatatttataaatacagccacacttaaaatggtcccatta 
tgaaatacatatttaaatatctatacgatgtgttaaaaccaagaaaatatttgattcttctctgatatttaagaattgaa 
ggtttgaggtagttacgtgttaggggcatttatattcacgtttttagagtttgcttatacaacttaatctttccttttca 
gTGCTTTGGGCTCCTGGGAGTTAATGGGGCTGGAAAATCATCAACTTTCAAGATGTTAACAGGAGATACCACTGTTACCA 
GAGGAGATGCTTTCCTTAACAAAAATAGgtgagaaaagaagtgccttgtattttgctgcaaagactttgtttttaattta 
tttaaagaaataggttgttatttttgattacagtggtatttttagagttcataaaaatgttgaaatatagtaaagggtaa 
agaagcacataaaatcatccatgatttcaatatctagagataatcacaatttacatttcctttcagtctcattctcttct 
tttaacagctttattcaggtataatttacatacaatataatttgcttgttttttaagagtataatttagtgatttttggt 
aaattgagagttttgcaaccatcaccacaatccagttttagaacttttccatcaccccacatctgtcttatatacacata 
taaatgtgccatacaattgagatcatactgtatgtagaatttaaaattagtttttattgttaatgagtgtattatgaata 
tttcccagtgggttacatttcctaagatgtggaattttacattgctacataaaatccccctatgtacatgtacctataat 
ttatttaataaattccttataaatgttggacacattagtttccatttttcactatgtaaatatgtccctgtatacatctt 
ttattatttcctcaggaacaattcctacaaagtaaattgccctctctaaagagcatacaaattgactgagccaccgttag 
gccattttctgagactgcacaggtcacaaagcaatctgatctttgggaatacagctacattttataggcttcttagataa 
tgttactctaagtactttaaatatgtggggcttctctgggcttttttttttttgagacggagtttcactcttactgccca 
ggctggagagcaatggcgcgaccttggctcactgcaacctccgcctcccaggttcaagcgattctcctgcctcagcctcc 
tgagtagctgagattacaggtgcccgccacaatgcctgcctaatttttttgtattttcagtagagatggggtttcaccat 
gttggccagactggtctcgagctcctgacctcaggtgatccacctgcctcagcctcccaaagttctgggattacaggcat 
gagccactgcgcccggcttctctggacttattatgtggagagatagtacaaggcagtggctttcagagttttttgaccat 
gaccgttgtgggaaatacattttatatctcaacctagtatgtacacacagacatgtagacacatgtataacctaaagttt 
cataaagcagtacctactgttactaattgtagtgcactctgctatttcttattctaccttatactgcgtcattaaaaaag 
tgctggtcatgacccactaaatttatttcccaaaccactaatgaacaatgactcacaatttgaacacactggacaggggg 
atagccaataaaattgaaaagagcaaggaaattaatgtattcatgatctcctctcctgtctcttacatttttgcagtagc 
aatgtaaaggaatcctaagagaacagacattctgggaatagcaggcctagcgctgcacaactgctttcctaggcttgctc 
ctagtaccaagctcctgacgcatatagcagtggcagtaataaccagcccatagtaaggtttgtcacagggactggttgta 
agaactgatttgrttggtatagctgtgagggcctggcacggtgtccacgtgtgcctcaatcctaattctgaaaaaggctg 
accctgggggtgctaattagatacacagagaggaatgaatgctgccagaaggccaagttcatggcaatgccgctgtggct 
gaggtgcagtcatcagtctggaacgtgaacactgaacttctctcacatgtgattcttcacttgactggcttcatagaacc 
ccaaagccaccccaccaccacataaattgtgtctctaggttctgtgttgctcacactcaaaatttctgggccttctcatt 
tggtgcatgtgaatggtgcatatgagtgaagtctaggatggggccttagcgttaaagccctggggtagtgtgactgagat 
tgttggtaaagaatgtgcagtggttggcatgacctcagaaattctgaaatgggactgcacctgcagactgaagtgttcag 
agagccagggaggtgcaaggactggggagggtagaggcaggaaccctgcctgccaggaagagctagcatcctgggggcag
aaaggctgtgctttcaagtagcagcagatgtattggtatctttgtaatggagaagcatactttacaggaacattaggcca 
gattgtctaaccagagtatctctacctgcttaaaatctaagtagttttcttgtcctttgcagTATCTTATCAAACATCCA 
TGAAGTACATCAGAACATGGGCTACTGCCCTCAGTTTGATGCCATCACAGAGCTGTTGACTGGGAGAGAACACGTGGAGT 
TCTTTGCCCTTTTGAGAGGAGTCCCAGAGAAAGAAGTTGGCAAGgtactgtgggcacctgaaagccagcctgtctccttt
ggcatcctgacaatatataccttatggcttttccacacgcattgacttcaggctgtttttcctcatgaatgcagcagcac 
aaaatgctggttctttgtatctgctttcagggtggaaacctgtaacggtggtggggcagggctgggtgggcagagaggga 
gtgctgctcccaccacacgagtcccttctccctgctttggctcctcaccagttgtcaggttatgattatagaatctagtc 
ctactcagtgaaagaactttcatacatgtatgtgtacgacagcatgataaaattcccaagccagaccaaagtcaaggtgc 
tttttatcactgtagGTTGGTGAGTGGGCGATTCGGAAACTGGGCCTCGTGAAGTATGGAGAAAAATATGCTGGTAACTA
TAGTGGAGGCAACAAACGCAAGCTCTCTACAGCCATGGCTTTGATCGGCGGGCCTCCTGTGGTGTTTCTGgtgagtataa 
ctgtggatggaaaactgttgttctggcctgagtggaaaacatgactgttcaaaagtcctatatgtccagggctgttgtat
gattggcttgtcttcccccagggacagcagagcaaccttggaaaagcagagggaagcttctcccttggcacacactgggg 
tggctgtaccatgcctgcagatgctcccaaatagaggcactccaagcactttgtttcttagcgtgattgaggctggatat 
gtgatttgatctttctctggaacattctttctaatcatctttgtgttcattccctgaaaatgaagagtgtggacacagct 
ttaaaatccccaaggtagcaactaggtcatagttccttacacacggatagatgaaaaacagatcagactgggaagtgccc 
cttgaccttttttcctctgtagataagagcattgatgttattacgggaagaagcctttgaggcttttatgtattccacct 
cggtctggaatttgtttctgtaaggctaacagttgcaatatactagggtaatctgagtgagctggaattaaaaaaaaaaa 
ggaatttcaccccaatcttatactgacttcaatagaggtttcagacaaaaagttgttttgtat 
SEQ ID NO: 29



Genomic contig containing ABC1 exons 46 to 49:

ngccnngttnaaaangaaaatttnnnnnaaattnaannttannggngnnntttccccagaaaaaacnaaaangatttccn 
cccnggggggncccccnantcnaaaaggccccncttntttgnggngagggaaagntttttttggaatttttaatttttgg 
tcccccaaaacctattattgagaatttaattacataaaaaagtactcagaatatttgagtttcctgcatcaataagacat 
ttataataatgaccttgtttacaaatgaatttgaaagttactctaattctttgattcatcaagaaataactagaatggca 
agttaaaatttaagctgtttcaaagatgcttctgcatttaaaaacaaatttatctttgattttttttccccccagcaaat 
aagacttattttattctaattacagGATGAACCCACCACAGGCATGGATCCCAAAGCCCGGCGGTTCTTGTGGAATTGTG 
CCCTAAGTGTTGTCAAGGAGGGGAGATCAGTAGTGCTTACATCTCATAGgtccgtagtaaagtcttgggttcctcactgt 
gggatgttttaactttccaagtagaatatgcgatcattttgtaaaaattagaaaatacagaaaagcaaagagtaaaacaa 
ttattacctgaaattatatatgcatattcttacaaaaatgcaagcccagtataaatactgctctttttcacttaatatat 
tgtaaacattattccaagtcagtgcatttaggtgtcatttcttatagctggatagtattccattaggatatactcttatt 
taactattcccccttttgtagacatttggattatttccaacttgttcacaattgtaaacaccactacactgaacagcatc 
atccctatatccacatgtacttgtaacagaatacaattccctaggaagctggaatgctggaagtcatggtgatgttctca 
tggttacagagaatctctctaaaactaaaacctctttctgttttaccgcagTATGGAAGAATGTGAAGCTCTTTGCACTA 
GGATGGCAATCATGGTCAATGGAAGGTTCAGGTGCCTTGGCAGTGTCCAGCATCTAAAAAATAGgtaataaagataattt 
ctttgggatagtgcctagtgagaaggcttgatatttattcttttgtgagtatataaatggtgcctctaaaataaagggaa 
ataaaactgagcaaaacagtatagtggaaagaatgagggctttgaagtccgaactgcattcaaattctgtctttaccatt 
tactggttctgtgactcttgggcaagttacttaactactgtaagagttagtttccctggaagatctacctcctagctttg 
tgctatagatgaaatgaaaaaaatttacatgtcccagtactggtgagagcgcaagctttggagtcaaacacaaatgggtt 
tgcatcctggccctaccaattatgagctctgagccatgggcaagtgactaactccctgggcctcagtttctctgtaacat
ctgtcagacttcatgggtccaggtgaggattaaaggagatcatgtatttacagcacatggcatggtgcttcacataaaat 
aagtatttagtaaatgataactggttccttctctcagaaacttatttctgggcctgccaggggccgccctttttcatggc 
acaagttgggttcccagggttcagtattcttttaaatagttttctggagatcctccatttgggtattttttcctgctttc 
agGTTTGGAGATGGTTATACAATAGTTGTACGAATAGCAGGGTCCAACCCGGACCTGAAGCCTGTCCAGGATTTCTTTGG 
ACTTGCATTTCCTGGAAGTGTTCYAAAAGAGAAACACCGGAACATGCTACAATACCAGCTTCCATCTTCATTATCTTCTC 
TGGCCAGGATATTCAGCATCCTCTCCCAGAGCAAAAAGCGACTCCACATAGAAGACTACTCTGTTTCTCAGACAACACTT 
GACCAAgtaagctttgagtgtcaaaacagatttacttctcagggtgtggattcctgccccgacactcccgcccataggtc 
caagagcagtttgtatcttgaattggtgcttgaattcctgatctactattcctagctatgctttttactaaacctctctg 
aacctgaaaagggagatgatgcctatgtactctataggattattgtgagaatttactgtaataataaccataaaaactac 
catttagtgagcacctaccatgggccaggcattttacttggtgcctaatcctatttaaattagataaaaaagtaccaaat 
aggtcctgacacttaagaagtactcagtaaatattttcttccctcttccctttaatcaagaccgtatgtgccaaagtaaa 
tggatgactgagcagttggtgatgtaggggtggggggcgatatagaaagtcagtttttggccgggcgtggtggctcatgc 
ctgtaatcccagcactttgggaggctgaggagcaggcagatcatgaggtcaggagatccagataatcctggccaacaggg 
tgaaaccccgtctctactaaaaatacaaaaattagctgggcatggtggtgcgcacttgtagtcccagctacttgcgaggc 
tgaggcaggagaattgctcgaacccaggaggtggaggttacagtgagccaaggtctcgccactgcactccagcctgggga 
cagagcaagaccccatttcaaggggggaaaaaaagtctatttttaagttgttattgcttttttcaagtattcttccctcc 
ttcacacacagttttctagttaatccatttatgtaattctgtatgctcctacttgacctaatttcaacatctggaaaaat 
agaactagaataaagaatgagcaagttgagtggtatttataaaggtccatcttaatcttttaacagGTATTTGTGAACTT 
TGCCAAGGACCAAAGTGATGATGACCACTTAAAAGACCTCTCATTACACAAAAACCAGACAGTAGTGGACGTTGCAGTTC 
TCACATCTTTTCTACAGGATGAGAAAGTGAAAGAAAGCTATGTATGAAGAATCCTGTTCATACGGGGTGGCTGAAAGTAA 
AGAGGAACTAGACTTTCCTTTGCACCATGTGAAGTGTTGTGGAGAAAAGAGCCAGAAGTTGATGTGGGAAGAAGTAAACT 
GGATACTGTACTGATACTATTCAATGCAATGCAATTCAATGcaatgaaaacaaaattccattacaggggcagtgcctttg 
tagcctatgtcttgtatggctctcaagtgaaagacttgaatttagttttttacctatacctatgtgaaactctattatgg 
aacccaatggacatatgggtttgaactcacacttttttttttttttttgttcctgtgtattctcattggggttgcaacaa 
taattcatcaagtaatcatggccagcgattattgatcaaaatcaaaaggtaatgcacatcctcattcactaagccatgcc 
atgcccaggagactggtttcccggtgacacatccattgctggcaatgagtgtgccagagttattagtgccaagtttttca 
gaaagtttgaagcaccatggtgtgtcatgctcacttttgtgaaagctgctctgctcagagtctatcaacattgaatatca 
gttgacagaatggtgccatgcgtggctaacatcctgctttgattccctctgataagctgttctggtggcagtaacatgca 
acaaaaatgtgggtgtctccaggcacgggaaacttggttccattgttatattgtcctatgcttcgagccatgggtctaca 
gggtcatccttatgagactcttaaatatacttagatcctggtaagaggcaaagaatcaacagccaaactgctggggctgc 
aactgctgaagccagggcatgggattaaagagattgtgcgttcaaacctagggaagcctgtgcccatttgtcctgactgt 
ctgctaacatggtacactgcatctcaagatgtttatctgacacaagtgtattatttctggctttttgaattaatctagaa 
aatgaaaagatggagttgtattttgacaaaaatgtttgtactttttaatgttatttggaattttaagttctatcagtgac 
ttctgaatccttagaatggcctctttgtagaaccctgtggtatagaggagtatggccactgcccactatttttattttct 
tatgtaagtttgcatatcagtcatgactagtgcctacaaagcaatgtgatggtcaggatctcatgacattatatttgagt 
ttctttcagatcatttaggatactcttaatctcacttcatcaatcaaatattttttcagtgtatgctgtagctgaaagag 
tatgtacgtacgtataagactagagagatattaagtctcagtacacttcctgtgccatgttattcagctcactggtttac
aaatataggttgtcttgtggttgtaggagcccactgtaacaatactgggcagcctttttttttttttttttaattgcaac 
aatgcaaaagccaagaaagtttaagggtcacaagtctaaacaatgaattcttcaacagggaaaacagctagcttgaaaac 
ttgctgaaaaacacaacttgtgtttatggcatttagtaccttcaaataattggctttgcagatattggataccccattaa 
atctgacagtctcaaatttttcatctcttcaatcactagtcaagaaaaaatataaaaacaacaaatacttccatatggag 
catttttcagagttttctaacccagtcttatttttctagtcagtaaacatttgtaaaaatactgtttcactaatacttac 
tgttaactgtcttgagagaaaagaaaaatatgagagaactattgtttggggaagttcaagtgatctttcaatatcattac
taacttcttccactttttccagaatttgaatattaacgctaaaggcgtaagacttcagatttcaaattaatctttctata 
ttttttaaatttacagaatattatataacccactgctgaaaaagaaacaaatgattgttttagaagttaaaggtcaatat 
tgattttaaaatattaag 
No.  Name                                  Location in SEQ ID No. 14  Sequence        Sequence Length  Strand
1    PPRE                                  58-69                      AGGTAAAAGTCA    12               Complement
2    PPRE                                  1997-2009                  AGAGTAGAGGGCA   13               Lead
3    PPRE                                  2150-2161                  ATGTCAAGTTCA    12               Lead
4    PPRE                                  2156-2169                  AGTTCAAAAGOGCA  14               Lead
5    PPRE                                  4126-4139                  AGGCCAGCAGGGCC  14               Complement
6    PPRE                                  5075-5087                  AGGGCAGAAGTGA   13               Lead
7    PPRE                                  6604-6615                  ATGCCAAGGTCA    12               Complement
8    PPRE                                  6731-6743                  G6GGCAAGGGTA    13               Complement
9    PPRE                                  7220-7233                  AGGTAATGAGGACA  14               Complement
10   PPRE                                  7554-7568                  GGATCACGAGGTCA  15               Complement
1    SRE                                   159-166                    CAGCCCAT        8                Lead
2    SRE                                   1133-1140                  CAGCTCAC        8                Complement
3    SRE                                   1145-1152                  CACACCAC        8                Lead
4    SRE                                   1809-1816                  CAGCCCTC        8                Complement
5    SRE                                   1894-1901                  CAGCCCAT        8                Lead
6    SRE                                   2563-2570                  CAACCCAC        8                Lead
7    SRE                                   3303-3310                  CAGCTCAC        8                Lead
8    SRE                                   3470-3477                  CCGCCCAC        8                Lead
9    SRE                                   4764-4791                  CTCCCCAC        8                Complement
10   SRE                                   4802-4809                  CAGCCTAC        8                Complement
11   SRE                                   4970-4977                  CAGCTCAC        8                Complement
12   SRE                                   6487-6494                  CAGCCTAC        8                Complement
13   SRE                                   6565-6572                  CACCCAAC        8                Complement
14   SRE                                   6727-6734                  CAGCCTAC        8                Lead
15   SRE                                   7041-7048                  CACCCAAC        8                Lead
16   SRE                                   8059-8066                  CAGCCCTC        8                Complement
1    ROR (retinoic acid receptor related)  166-172                    AGGGTCA         7                Complement
2    ROR (retinoic acid receptor related)  156-173                    AAGGGTCA        8                Complement
3    ROR (retinoic acid receptor related)  363-370                    ATGGGTCA        8                Lead
4    ROR (retinoic acid receptor related)  364-370                    TGGGTCA         7                Lead
5    ROR (retinoic acid receptor related)  2218-2225                  TAGGGTCA        8                Lead
6    ROR (retinoic acid receptor related)  2219-2225                  AGGGTCA         7                Lead
7    ROR (retinoic acid receptor related)  3643-3649                  TGGGTCA         7                Lead
8    ROR (retinoic acid receptor related)  6604-6610                  AAGGTCA         7                Complement
1    SREBP-1 or "E box"                    473-479                    ACACCTG         7                Complement
2    SREBP-1 or "E box"                    536-541                    ACACATG         7                Lead
3    SREBP-1 or "E box"                    537-543                    TCATGTG         7                Complement
4    SREBP-1 or "E box"                    655-661                    TCATGTG         7                Complement
5    SREBP-1 or "E box"                    925-931                    ACACTTG         7                Lead
6    SREBP-1 or "E box"                    957-973                    TCACTTG         7                Lead
7    SREBP-1 or "E box"                    968-974                    TCAAGTG         7                Complement
8    SREBP-1 or "E box"                    1063-1069                  ACAGGTG         7                Complement
9    SREBP-1 or "E box"                    1104-1110                  TCACTTG         7                Lead
10   SREBP-1 or "E box"                    1105-1111                  TCAAGTG         7                Complement
11   SREBP-1 or "E box"                    1551-1567                  TCACTTG         7                Lead
12   SREBP-1 or "E box"                    1670-1676                  TCAAATG         7                Lead


13 SREBP-1 or"E box" 


1748-1754 


ACACTTG 


7 Lead 


14 SREBP-1 or 'E box" 


1749-1755 


ACAAGTG 


7 Complement 


15 SREBP-1 or "E box" 


1852-1858 


TCATGTG 


7 Lead 


16 SREBP-1 or "E box" 


1853-1859 


ACACATG 


7 Complement 


17 SREBP-1 or'Ebox" 


1899-1905 


ACAAATG 


7 Complement 


18 SREBP-1 or"E box" 


2199-2205 


ACACGTG 


7 Lead 


19 SREBP-1 or "E box" 


2393-2399 


ACAGCTG 


7 Complement 


20 SREBP-1 or "E box" 


2559-27005 


ACACCTG 


7 Lead 


21 SREBP-1 or "E box" 


2577-2683 


TCACATG 


7 Complement 


22 SREBP-1 or "E box" 


2740-2746 


ACAACTG 


7 Comptement 


23 SREBP-1 or "E box" 


2369-2975 


ACAAATG 


7 Lead 


24 SREBP-1 or "E box" 


2979-2985 


ACACATG 


7 Lead 


25 SREBP-1 or "E box" 


2981-2987 


ACATGTG 


7 Lead 


26 SREBP-1 or "E box" 


2980-2986 


ACATGTG 


7 Complement 


27 SREBP-1 or "E box" 


2982-2988 


ACACATG 


7 Complement 


28 SREBP-1 or "E box" 


3461-3467 


TCAGGTG 


7 Lead 


29 SREBP-1 or "E box" 


3462-3468 


TCACCTG 


7 Complement 


30 SREBP-1 or "E box" 


3547-3553 


TCAACTG 


7 Complement 


31 SREBP-1 or "E box" 


3752-3758 


ACACATG 


7 Lead 


32 SREBP-1 or "E box" 


4226-4232 


TCACCTG 


7 Lead 




4562-4588 


ACACGTG 


7 Complement 


34 SREBP-1 or 'E box" 


4588-4594 


TCAGTTG 


7 Lead 


35 SREBP-1 or "E box" 


4861-4857 


TCAGGTG 


7 Lead 


36 SREBP-1 or "E box" 


4951-4957 


ACAAATG 


7 Lead 


37 SREBP-1 or "E box" 


5096-5102 


TCAAATG 


7 Complement 


38 SREBP-1 or "Ebox" 


5912-5918 


ACAGTTG 


7 Lead 


39 SREBP-1 or -Ebox" 


5913-5919 


TCAACTG 


7 Complement 


40 SREBP-I or "E box" 


6245-6251 


ACACATG 


7 Complement 


41 SREBP-1 or "Ebox" 


6288-6294 


ACAACTG 


7 Complement 


42 SREBP-1 or "E box" 


6623-5629 


TCATTTG 


7 Lead 


43 SREBP-1 or "E box" 


S836-6842 


TCACCTG 


7 Lead 


44 SREBP-1 or "E box" 


6837-6843 


ACAGGTG 


7 Complement 


45 SREBP-1 or "E box" 


7032-7038 


ACAGGTG 


7 Complement 
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46 SREBP-'J or "E box" 


7059-7075 


TCAGGTG 


7 Lead 


47 SREBP-1 or"EtX3x" 


7101-7107 


ACATATG 


7 Complement 


48 SREBP- 1 or "E box" 


7138-7144 


ACAGTTG 


7 Lead 


49 SREBP-1 or"E tx)x" 


7139-7145 


TCAACTG 


7 Complement 


50 SREBP' 1 or"E box" 


7240-7246 


ACACCTG 


7 Complement 


51 SREBP-1 or "E box" 


7467-7473 


ACAGGTG 


7 Lead 


52 SREBP' 1 or -E box- 


7640-7646 


TCATTTG 


7 Lead 


53 SREBP-1 or-E box" 


7641-7647 


TCAAATG 


7 Complement 


54 SREBP-1 or "E box" 


7653-7659 


TCAGTTG 


7 Lead 


55 SREBP-1 or "E box" 


7654-7660 


ACAACTG 


7 Complement 


56 SREBP-1 or "E box" 


7735-7741 


ACAAATG 


7 Lead 


57 SREBP-1 or"E box- 


7838-7844 


TCAGGTG 


7 Complement 


SB SREBP-1 or'-E box" 


7880-7886 


TCATCTG 


7 Complement 


59 SREBP-1 or "E box- 


8051-8057 


TCAGGTG 


7 Lead 


so SREBP-1 or "E box" 


8052-8058 


TCAGCTG 


7 Complement 
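
The table above gives each putative binding site as a location in SEQ ID No. 14, a sequence, a length, and a strand. A minimal sketch of how such rows could be cross-checked against the sequence is shown here. It assumes, and this is an assumption rather than a statement from the specification, that locations are 1-based and inclusive and that "Complement" rows are read 5' to 3' on the strand opposite the listed one. The helper names are hypothetical. The reference string is the first 180 nucleotides of SEQ ID No. 14 as printed in the sequence listing, so only rows that fall inside that window are checked.

```python
# Sketch only: helper names are hypothetical and not part of the specification.

COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a", "n": "n"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, in lower case."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def coords_match_length(start: int, end: int, length: int) -> bool:
    """True when 1-based inclusive coordinates span exactly `length` bases."""
    return end - start + 1 == length


def site_sequence(reference: str, start: int, end: int, strand: str) -> str:
    """Read a site 5'->3'; rows marked 'Complement' are reverse-complemented."""
    lead = reference[start - 1:end].lower()
    return lead if strand == "Lead" else reverse_complement(lead)


# First 180 nucleotides of SEQ ID No. 14, copied from the sequence listing.
SEQ14_PREFIX = (
    "acctcttatagaatgatagaattcctctggaatgattggataacttcatttcatccttga"
    "cttttaccttggaggatttcttaccccttttggcttctcaaatttgactattaaaatgtt"
    "gcctttaaaaataggaacacagtttcaggggggagtaccagcccatgacccttctgcaag"
)

# (name, start, end, sequence, length, strand) for rows inside the prefix.
ROWS = [
    ("PPRE", 58, 69, "AGGTAAAAGTCA", 12, "Complement"),
    ("SRE", 159, 166, "CAGCCCAT", 8, "Lead"),
    ("ROR", 166, 172, "AGGGTCA", 7, "Complement"),
]

for name, start, end, seq, length, strand in ROWS:
    consistent = (
        coords_match_length(start, end, length)
        and site_sequence(SEQ14_PREFIX, start, end, strand) == seq.lower()
    )
    print(name, f"{start}-{end}", "consistent" if consistent else "check entry")
```

The same length test can be run over every row of the table without the full sequence: a row whose printed location does not span its stated length, for example 4764-4791 with a length of 8, is a candidate transcription error in the location, the length, or both.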
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As a below named inventor, 1 hereby declare that: 

My residence, post office address and citizenship are as stated below next to my name. 

I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and
joint inventor (if plural names are listed below) of the subject matter which is claimed and for which a 
patent is sought on the invention entitled METHODS AND REAGENTS FOR MODULATING 
CHOLESTEROL LEVELS, the specification of which 

■ is attached hereto. 

□ was filed on as Application Serial No. 

and was amended on . 

□ was described and claimed in PCT International Application No. 

filed on and as amended under PCT Article 19 on . 



I hereby state that I have reviewed and understand the contents of the above-identified specification,
including the claims, as amended by any amendment referred to above.

I acknowledge the duty to disclose all information I know to be material to patentability in accordance with
Title 37, Code of Federal Regulations, §1.56(a).

FOREIGN PRIORITY RIGHTS: I hereby claim foreign priority benefits under Title 35, United States Code, 
§119 of any foreign application(s) for patent or inventor's certificate or of any PCT international 
application(s) designating at least one country other than the United States of America listed below and
have also identified below any foreign application for patent or inventor's certificate or any PCT 
international application(s) designating at least one country other than the United States of America filed 
by me on the same subject matter having a filing date before that of the application(s) of which priority is 
claimed: 



Country          Serial Number    Filing Date           Priority Claimed?
                                                        Yes/No



PROVISIONAL PRIORITY RIGHTS: I hereby claim priority benefits under Title 35, United States Code,
§119(e) and §120 of any United States provisional patent application(s) listed below filed by an inventor or
inventors on the same subject matter as the present application and having a filing date before that of the
application(s) of which priority is claimed:

Serial Number    Filing Date           Status
60/124,702       March 15, 1999        Pending
60/138,048       June 8, 1999          Pending
60/139,600       June 17, 1999         Pending
60/151,977       September 1, 1999     Pending

NON-PROVISIONAL PRIORITY RIGHTS: I hereby claim the benefit under Title 35, United States Code,
§120 of any United States application(s) listed below and, insofar as the subject matter of each of the
claims of this application is not disclosed in the prior United States application in the manner provided by
the first paragraph of Title 35, United States Code, §112, I acknowledge the duty to disclose all information
I know to be material to patentability as defined in Title 37, Code of Federal Regulations, §1.56(a) which
became available between the filing date of the prior application and the national or PCT international filing
date of this application:

Serial Number    Filing Date           Status









I hereby appoint the following attorneys and/or agents to prosecute this application and to transact all
business in the Patent and Trademark Office connected therewith: Paul T. Clark, Reg. No. 30,162, Karen
L. Elbing, Ph.D., Reg. No. 35,238, Kristina Bieker-Brady, Ph.D., Reg. No. 39,109, Susan M. Michaud, Ph.D.,
Reg. No. 42,885, Mary Rose Scozzafava, Ph.D., Reg. No. 36,268, James D. DeCamp, Ph.D., Reg. No.
43,580.



Address all telephone calls to: Paul T. Clark at 617/428-0200. 

Address all correspondence to: Paul T. Clark at Clark & Elbing LLP, 176 Federal Street, Boston, MA
02110.

I hereby declare that all statements made herein of my own knowledge are true and that all statements
made on information and belief are believed to be true; and further that these statements were made with
the knowledge that willful false statements and the like so made are punishable by fine or imprisonment,
or both, under Section 1001 of Title 18 of the United States Code and that such willful false statements
may jeopardize the validity of the application or any patents issued thereon.



Full Name 

(First, Middle, Last) 


Residence Address 
(City, State, Country) 


Post Office Address 
(Street, City, State, 
Country) 


Citizenship 


Michael R. Hayden 


Vancouver, British Columbia 
CANADA 


4484 West 7th Avenue
Columbia

V6R 1W9 CANADA
CANADIAN
Date:
CANADA
Date:
Post Office Address
CANADIAN






Columbia 


Vancouver, British Columbia 








CANADA 


V6T 1C5 CANADA
Date:

SEQUENCE LISTING

<110> Hayden, Michael R. 
Wilson, Angela R. 
Pimstone, Simon N.

<120> METHODS AND REAGENTS FOR MODULATING
CHOLESTEROL LEVELS

<130> 50110/002005

<150> 60/124,702
<151> 1999-03-15

<150> 60/138,048
<151> 1999-06-08

<150> 60/139,600
<151> 1999-06-17

<150> 60/151,977
<151> 1999-09-01

<160> 287

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 2261
<212> PRT
<213> Homo sapiens

<400> 1

Met Ala Cys Trp Pro Gln Leu Arg Leu Leu Leu Trp Lys Asn Leu Thr
1 5 10 15

Phe Arg Arg Arg Gln Thr Cys Gln Leu Leu Leu Glu Val Ala Trp Pro
20 25 30

Leu Phe Ile Phe Leu Ile Leu Ile Ser Val Arg Leu Ser Tyr Pro Pro
35 40 45

Tyr Glu Gln His Glu Cys His Phe Pro Asn Lys Ala Met Pro Ser Ala
50 55 60

Gly Thr Leu Pro Trp Val Gln Gly Ile Ile Cys Asn Ala Asn Asn Pro
65 70 75 80

Cys Phe Arg Tyr Pro Thr Pro Gly Glu Ala Pro Gly Val Val Gly Asn
85 90 95

Phe Asn Lys Ser Ile Val Ala Arg Leu Phe Ser Asp Ala Arg Arg Leu
100 105 110

Leu Leu Tyr Ser Gln Lys Asp Thr Ser Met Lys Asp Met Arg Lys Val
115 120 125

Leu Arg Thr Leu Gln Gln Ile Lys Lys Ser Ser Ser Asn Leu Lys Leu
130 135 140

Gln Asp Phe Leu Val Asp Asn Glu Thr Phe Ser Gly Phe Leu Tyr His
145 150 155 160

Asn Leu Ser Leu Pro Lys Ser Thr Val Asp Lys Met Leu Arg Ala Asp
165 170 175

Val Ile Leu His Lys Val Phe Leu Gln Gly Tyr Gln Leu His Leu Thr
180 185 190

Ser Leu Cys Asn Gly Ser Lys Ser Glu Glu Met Ile Gln Leu Gly Asp
195 200 205

Gln Glu Val Ser Glu Leu Cys Gly Leu Pro Arg Glu Lys Leu Ala Ala
210 215 220

Ala Glu Arg Val Leu Arg Ser Asn Met Asp Ile Leu Lys Pro Ile Leu
225 230 235 240

Arg Thr Leu Asn Ser Thr Ser Pro Phe Pro Ser Lys Glu Leu Ala Glu
245 250 255

Ala Thr Lys Thr Leu Leu His Ser Leu Gly Thr Leu Ala Gln Glu Leu
260 265 270

Phe Ser Met Arg Ser Trp Ser Asp Met Arg Gln Glu Val Met Phe Leu
275 280 285

Thr Asn Val Asn Ser Ser Ser Ser Ser Thr Gln Ile Tyr Gln Ala Val
290 295 300

Ser Arg Ile Val Cys Gly His Pro Glu Gly Gly Gly Leu Lys Ile Lys
305 310 315 320

Ser Leu Asn Trp Tyr Glu Asp Asn Asn Tyr Lys Ala Leu Phe Gly Gly
325 330 335

Asn Gly Thr Glu Glu Asp Ala Glu Thr Phe Tyr Asp Asn Ser Thr Thr
340 345 350

Pro Tyr Cys Asn Asp Leu Met Lys Asn Leu Glu Ser Ser Pro Leu Ser
355 360 365

Arg Ile Ile Trp Lys Ala Leu Lys Pro Leu Leu Val Gly Lys Ile Leu
370 375 380

Tyr Thr Pro Asp Thr Pro Ala Thr Arg Gln Val Met Ala Glu Val Asn
385 390 395 400

Lys Thr Phe Gln Glu Leu Ala Val Phe His Asp Leu Glu Gly Met Trp
405 410 415

Glu Glu Leu Ser Pro Lys Ile Trp Thr Phe Met Glu Asn Ser Gln Glu
420 425 430

Met Asp Leu Val Arg Met Leu Leu Asp Ser Arg Asp Asn Asp His Phe
435 440 445

Trp Glu Gln Gln Leu Asp Gly Leu Asp Trp Thr Ala Gln Asp Ile Val
450 455 460

Ala Phe Leu Ala Lys His Pro Glu Asp Val Gln Ser Ser Asn Gly Ser
465 470 475 480

Val Tyr Thr Trp Arg Glu Ala Phe Asn Glu Thr Asn Gln Ala Ile Arg
485 490 495

Thr Ile Ser Arg Phe Met Glu Cys Val Asn Leu Asn Lys Leu Glu Pro
500 505 510

Ile Ala Thr Glu Val Trp Leu Ile Asn Lys Ser Met Glu Leu Leu Asp
515 520 525

Glu Arg Lys Phe Trp Ala Gly Ile Val Phe Thr Gly Ile Thr Pro Gly
530 535 540

Ser Ile Glu Leu Pro His His Val Lys Tyr Lys Ile Arg Met Asp Ile
545 550 555 560

Asp Asn Val Glu Arg Thr Asn Lys Ile Lys Asp Gly Tyr Trp Asp Pro
565 570 575

Gly Pro Arg Ala Asp Pro Phe Glu Asp Met Arg Tyr Val Trp Gly Gly
580 585 590

Phe Ala Tyr Leu Gln Asp Val Val Glu Gln Ala Ile Ile Arg Val Leu
595 600 605

Thr Gly Thr Glu Lys Lys Thr Gly Val Tyr Met Gln Gln Met Pro Tyr
610 615 620

Pro Cys Tyr Val Asp Asp Ile Phe Leu Arg Val Met Ser Arg Ser Met
625 630 635 640

Pro Leu Phe Met Thr Leu Ala Trp Ile Tyr Ser Val Ala Val Ile Ile
645 650 655

Lys Gly Ile Val Tyr Glu Lys Glu Ala Arg Leu Lys Glu Thr Met Arg
660 665 670

Ile Met Gly Leu Asp Asn Ser Ile Leu Trp Phe Ser Trp Phe Ile Ser
675 680 685

Ser Leu Ile Pro Leu Leu Val Ser Ala Gly Leu Leu Val Val Ile Leu
690 695 700

Lys Leu Gly Asn Leu Leu Pro Tyr Ser Asp Pro Ser Val Val Phe Val
705 710 715 720

Phe Leu Ser Val Phe Ala Val Val Thr Ile Leu Gln Cys Phe Leu Ile
725 730 735

Ser Thr Leu Phe Ser Arg Ala Asn Leu Ala Ala Ala Cys Gly Gly Ile
740 745 750

Ile Tyr Phe Thr Leu Tyr Leu Pro Tyr Val Leu Cys Val Ala Trp Gln
755 760 765

Asp Tyr Val Gly Phe Thr Leu Lys Ile Phe Ala Ser Leu Leu Ser Pro
770 775 780

Val Ala Phe Gly Phe Gly Cys Glu Tyr Phe Ala Leu Phe Glu Glu Gln
785 790 795 800

Gly Ile Gly Val Gln Trp Asp Asn Leu Phe Glu Ser Pro Val Glu Glu
805 810 815

Asp Gly Phe Asn Leu Thr Thr Ser Val Ser Met Met Leu Phe Asp Thr
820 825 830

Phe Leu Tyr Gly Val Met Thr Trp Tyr Ile Glu Ala Val Phe Pro Gly
835 840 845

Gln Tyr Gly Ile Pro Arg Pro Trp Tyr Phe Pro Cys Thr Lys Ser Tyr
850 855 860

Trp Phe Gly Glu Glu Ser Asp Glu Lys Ser His Pro Gly Ser Asn Gln
865 870 875 880

Lys Arg Ile Ser Glu Ile Cys Met Glu Glu Glu Pro Thr His Leu Lys
885 890 895

Leu Gly Val Ser Ile Gln Asn Leu Val Lys Val Tyr Arg Asp Gly Met
900 905 910

Lys Val Ala Val Asp Gly Leu Ala Leu Asn Phe Tyr Glu Gly Gln Ile
915 920 925

Thr Ser Phe Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Met Ser
930 935 940

Ile Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Thr Ala Tyr Ile Leu
945 950 955 960

Gly Lys Asp Ile Arg Ser Glu Met Ser Thr Ile Arg Gln Asn Leu Gly
965 970 975

Val Cys Pro Gln His Asn Val Leu Phe Asp Met Leu Thr Val Glu Glu
980 985 990

His Ile Trp Phe Tyr Ala Arg Leu Lys Gly Leu Ser Glu Lys His Val
995 1000 1005

Lys Ala Glu Met Glu Gln Met Ala Leu Asp Val Gly Leu Pro Ser Ser
1010 1015 1020

Lys Leu Lys Ser Lys Thr Ser Gln Leu Ser Gly Gly Met Gln Arg Lys
1025 1030 1035 1040

Leu Ser Val Ala Leu Ala Phe Val Gly Gly Ser Lys Val Val Ile Leu
1045 1050 1055

Asp Glu Pro Thr Ala Gly Val Asp Pro Tyr Ser Arg Arg Gly Ile Trp
1060 1065 1070

Glu Leu Leu Leu Lys Tyr Arg Gln Gly Arg Thr Ile Ile Leu Ser Thr
1075 1080 1085

His His Met Asp Glu Ala Asp Val Leu Gly Asp Arg Ile Ala Ile Ile
1090 1095 1100

Ser His Gly Lys Leu Cys Cys Val Gly Ser Ser Leu Phe Leu Lys Asn
1105 1110 1115 1120

Gln Leu Gly Thr Gly Tyr Tyr Leu Thr Leu Val Lys Lys Asp Val Glu
1125 1130 1135

Ser Ser Leu Ser Ser Cys Arg Asn Ser Ser Ser Thr Val Ser Tyr Leu
1140 1145 1150

Lys Lys Glu Asp Ser Val Ser Gln Ser Ser Ser Asp Ala Gly Leu Gly
1155 1160 1165

Ser Asp His Glu Ser Asp Thr Leu Thr Ile Asp Val Ser Ala Ile Ser
1170 1175 1180

Asn Leu Ile Arg Lys His Val Ser Glu Ala Arg Leu Val Glu Asp Ile
1185 1190 1195 1200

Gly His Glu Leu Thr Tyr Val Leu Pro Tyr Glu Ala Ala Lys Glu Gly
1205 1210 1215

Ala Phe Val Glu Leu Phe His Glu Ile Asp Asp Arg Leu Ser Asp Leu
1220 1225 1230

Gly Ile Ser Ser Tyr Gly Ile Ser Glu Thr Thr Leu Glu Glu Ile Phe
1235 1240 1245

Leu Lys Val Ala Glu Glu Ser Gly Val Asp Ala Glu Thr Ser Asp Gly
1250 1255 1260

Thr Leu Pro Ala Arg Arg Asn Arg Arg Ala Phe Gly Asp Lys Gln Ser
1265 1270 1275 1280

Cys Leu Arg Pro Phe Thr Glu Asp Asp Ala Ala Asp Pro Asn Asp Ser
1285 1290 1295

Asp Ile Asp Pro Glu Ser Arg Glu Thr Asp Leu Leu Ser Gly Met Asp
1300 1305 1310

Gly Lys Gly Ser Tyr Gln Val Lys Gly Trp Lys Leu Thr Gln Gln Gln
1315 1320 1325

Phe Val Ala Leu Leu Trp Lys Arg Leu Leu Ile Ala Arg Arg Ser Arg
1330 1335 1340

Lys Gly Phe Phe Ala Gln Ile Val Leu Pro Ala Val Phe Val Cys Ile
1345 1350 1355 1360

Ala Leu Val Phe Ser Leu Ile Val Pro Pro Phe Gly Lys Tyr Pro Ser
1365 1370 1375

Leu Glu Leu Gln Pro Trp Met Tyr Asn Glu Gln Tyr Thr Phe Val Ser
1380 1385 1390

Asn Asp Ala Pro Glu Asp Thr Gly Thr Leu Glu Leu Leu Asn Ala Leu
1395 1400 1405

Thr Lys Asp Pro Gly Phe Gly Thr Arg Cys Met Glu Gly Asn Pro Ile
1410 1415 1420

Pro Asp Thr Pro Cys Gln Ala Gly Glu Glu Glu Trp Thr Thr Ala Pro
1425 1430 1435 1440

Val Pro Gln Thr Ile Met Asp Leu Phe Gln Asn Gly Asn Trp Thr Met
1445 1450 1455

Gln Asn Pro Ser Pro Ala Cys Gln Cys Ser Ser Asp Lys Ile Lys Lys
1460 1465 1470

Met Leu Pro Val Cys Pro Pro Gly Ala Gly Gly Leu Pro Pro Pro Gln
1475 1480 1485

Arg Lys Gln Asn Thr Ala Asp Ile Leu Gln Asp Leu Thr Gly Arg Asn
1490 1495 1500

Ile Ser Asp Tyr Leu Val Lys Thr Tyr Val Gln Ile Ile Ala Lys Ser
1505 1510 1515 1520

Leu Lys Asn Lys Ile Trp Val Asn Glu Phe Arg Tyr Gly Gly Phe Ser
1525 1530 1535

Leu Gly Val Ser Asn Thr Gln Ala Leu Pro Pro Ser Gln Glu Val Asn
1540 1545 1550

Asp Ala Ile Lys Gln Met Lys Lys His Leu Lys Leu Ala Lys Asp Ser
1555 1560 1565

Ser Ala Asp Arg Phe Leu Asn Ser Leu Gly Arg Phe Met Thr Gly Leu
1570 1575 1580

Asp Thr Arg Asn Asn Val Lys Val Trp Phe Asn Asn Lys Gly Trp His
1585 1590 1595 1600

Ala Ile Ser Ser Phe Leu Asn Val Ile Asn Asn Ala Ile Leu Arg Ala
1605 1610 1615

Asn Leu Gln Lys Gly Glu Asn Pro Ser His Tyr Gly Ile Thr Ala Phe
1620 1625 1630

Asn His Pro Leu Asn Leu Thr Lys Gln Gln Leu Ser Glu Val Ala Leu
1635 1640 1645

Met Thr Thr Ser Val Asp Val Leu Val Ser Ile Cys Val Ile Phe Ala
1650 1655 1660

Met Ser Phe Val Pro Ala Ser Phe Val Val Phe Leu Ile Gln Glu Arg
1665 1670 1675 1680

Val Ser Lys Ala Lys His Leu Gln Phe Ile Ser Gly Val Lys Pro Val
1685 1690 1695

Ile Tyr Trp Leu Ser Asn Phe Val Trp Asp Met Cys Asn Tyr Val Val
1700 1705 1710

Pro Ala Thr Leu Val Ile Ile Ile Phe Ile Cys Phe Gln Gln Lys Ser
1715 1720 1725

Tyr Val Ser Ser Thr Asn Leu Pro Val Leu Ala Leu Leu Leu Leu Leu
1730 1735 1740

Tyr Gly Trp Ser Ile Thr Pro Leu Met Tyr Pro Ala Ser Phe Val Phe
1745 1750 1755 1760

Lys Ile Pro Ser Thr Ala Tyr Val Val Leu Thr Ser Val Asn Leu Phe
1765 1770 1775

Ile Gly Ile Asn Gly Ser Val Ala Thr Phe Val Leu Glu Leu Phe Thr
1780 1785 1790

Asp Asn Lys Leu Asn Asn Ile Asn Asp Ile Leu Lys Ser Val Phe Leu
1795 1800 1805

Ile Phe Pro His Phe Cys Leu Gly Arg Gly Leu Ile Asp Met Val Lys
1810 1815 1820

Asn Gln Ala Met Ala Asp Ala Leu Glu Arg Phe Gly Glu Asn Arg Phe
1825 1830 1835 1840

Val Ser Pro Leu Ser Trp Asp Leu Val Gly Arg Asn Leu Phe Ala Met
1845 1850 1855

Ala Val Glu Gly Val Val Phe Phe Leu Ile Thr Val Leu Ile Gln Tyr
1860 1865 1870

Arg Phe Phe Ile Arg Pro Arg Pro Val Asn Ala Lys Leu Ser Pro Leu
1875 1880 1885

Asn Asp Glu Asp Glu Asp Val Arg Arg Glu Arg Gln Arg Ile Leu Asp
1890 1895 1900

Gly Gly Gly Gln Asn Asp Ile Leu Glu Ile Lys Glu Leu Thr Lys Ile
1905 1910 1915 1920

Tyr Arg Arg Lys Arg Lys Pro Ala Val Asp Arg Ile Cys Val Gly Ile
1925 1930 1935

Pro Pro Gly Glu Cys Phe Gly Leu Leu Gly Val Asn Gly Ala Gly Lys
1940 1945 1950

Ser Ser Thr Phe Lys Met Leu Thr Gly Asp Thr Thr Val Thr Arg Gly
1955 1960 1965

Asp Ala Phe Leu Asn Lys Asn Ser Ile Leu Ser Asn Ile His Glu Val
1970 1975 1980

His Gln Asn Met Gly Tyr Cys Pro Gln Phe Asp Ala Ile Thr Glu Leu
1985 1990 1995 2000

Leu Thr Gly Arg Glu His Val Glu Phe Phe Ala Leu Leu Arg Gly Val
2005 2010 2015

Pro Glu Lys Glu Val Gly Lys Val Gly Glu Trp Ala Ile Arg Lys Leu
2020 2025 2030

Gly Leu Val Lys Tyr Gly Glu Lys Tyr Ala Gly Asn Tyr Ser Gly Gly
2035 2040 2045

Asn Lys Arg Lys Leu Ser Thr Ala Met Ala Leu Ile Gly Gly Pro Pro
2050 2055 2060

Val Val Phe Leu Asp Glu Pro Thr Thr Gly Met Asp Pro Lys Ala Arg
2065 2070 2075 2080

Arg Phe Leu Trp Asn Cys Ala Leu Ser Val Val Lys Glu Gly Arg Ser
2085 2090 2095

Val Val Leu Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr
2100 2105 2110

Arg Met Ala Ile Met Val Asn Gly Arg Phe Arg Cys Leu Gly Ser Val
2115 2120 2125

Gln His Leu Lys Asn Arg Phe Gly Asp Gly Tyr Thr Ile Val Val Arg
2130 2135 2140

Ile Ala Gly Ser Asn Pro Asp Leu Lys Pro Val Gln Asp Phe Phe Gly
2145 2150 2155 2160

Leu Ala Phe Pro Gly Ser Val Leu Lys Glu Lys His Arg Asn Met Leu
2165 2170 2175

Gln Tyr Gln Leu Pro Ser Ser Leu Ser Ser Leu Ala Arg Ile Phe Ser
2180 2185 2190

Ile Leu Ser Gln Ser Lys Lys Arg Leu His Ile Glu Asp Tyr Ser Val
2195 2200 2205

Ser Gln Thr Thr Leu Asp Gln Val Phe Val Asn Phe Ala Lys Asp Gln
2210 2215 2220

Ser Asp Asp Asp His Leu Lys Asp Leu Ser Leu His Lys Asn Gln Thr
2225 2230 2235 2240

Val Val Asp Val Ala Val Leu Thr Ser Phe Leu Gln Asp Glu Lys Val
2245 2250 2255

Lys Glu Ser Tyr Val
2260

<210> 2 

<211> 7864 

<212> DNA 

<213> Homo sapiens 

<400> 2 

gtccctgctg tgagctctgg ccgctgcctt ccagggctcc cgagccacac gctgggggtg 60 

ctggctgagg gaacatggct tgttggcctc agctgaggtt gctgctgtgg aagaacctca 12 0 

ctttcagaag aagacaaaca tgtcagctgt tactggaagt ggcctggcct ctatttatct 180 

tcctgatcct gatctctgtt cggctgagct acccacccta tgaacaacat gaatgccatt 24 0 

ttccaaataa agccatgccc tctgcaggaa cacttccttg ggttcagggg attatctgta 300 

atgccaacaa cccctgtttc cgttacccga ctcctgggga ggctcccgga gttgttggaa 3 60 

actttaacaa atccattgtg gctcgcctgt tctcagatgc tcggaggctt cttttataca 42 0 

gccagaaaga caccagcatg aaggacatgc gcaaagttct gagaacatta cagcagatca 4 80 

agaaatccag ctcaaacttg aagcttcaag atttcctggt ggacaatgaa accttctctg 54 0 

ggttcctgta tcacaacctc tctctcccaa agtctactgt ggacaagatg ctgagggctg 60 0 

atgtcattct ccacaaggta tttttgcaag gctaccagtt acatttgaca agtctgtgca 660 

^tggatcaaa atcagaagag atgattcaac ttggtgacca agaagtttct gagctttgtg 72 0 

gcctaccaag ggagaaactg gctgcagcag agcgagtact tcgttccaac atggacatcc 780 

tgaagccaat cctgagaaca ctaaactcta catctccctt cccgagcaag gagctggctg 84 0 

aagccacaaa aacattgctg catagtcttg ggactctggc ccaggagctg ttcagcatga 90 0 

gaagctggag tgacatgcga caggaggtga tgtttctgac caatgtgaac agctccagct 960 

cctccaccca aatctaccag gctgtgtctc gtattgtctg cgggcatccc gagggagggg 1020 

ggctgaagat caagtctctc aactggtatg aggacaacaa ctacaaagcc ctctttggag 10 8 0 

gcaatggcac tgaggaagat gctgaaacct tctatgacaa ctctacaact ccttactgca 1140 

atgatttgat gaagaatttg gagtctagtc ctctttcccg cattatctgg aaagctctga 12 00 

agccgctgct cgttgggaag atcctgtata cacctgacac tccagccaca aggcaggtca 12 60 

tggctgaggt gaacaagacc ttccaggaac tggctgtgtt ccatgatctg gaaggcatgt 132 0 

99gagga-act cagccccaag atctggacct tcatggagaa cagccaagaa atggaccttg 13 8 0 

tccggatgct gttggacagc agggacaatg accacttttg ggaacagcag ttggatggct 144 0 

tagattggac agcccaagac atcgtggcgt ttttggccaa gcacccagag gatgtccagt 15 0 0 

ccagtaatgg ttctgtgtac acctggagag aagctttcaa cgagactaac caggcaatcc 1560 

ggaccatatc tcgcttcatg gagtgtgtca acctgaacaa gctagaaccc atagcaacag 162 0 

aagtctggct catcaacaag tccatggagc tgctggatga gaggaagttc tgggctggta 168 0 

ttgtgttcac tggaattact ccaggcagca ttgagctgcc ccatcatgtc aagtacaaga 174 0 

tccgaatgga cattgacaat gtggagagga caaataaaat caaggatggg tactgggacc 18 0 0 

ctggtcctcg agctgacccc tttgaggaca tgcggtacgt ctgggggggc ttcgcctact 1860 

tgcaggatgt ggtggagcag gcaatcatca gggtgctgac gggcaccgag aagaaaactg 192 0 
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gdgtgtctat atgcaacaga tgccctatcc ctgttacgtt gatgacatct ttctgcgggt 1980 

gatgagccgg tcaatgcccc tcttcatgac gctggcctgg atttactcag tggctgtgat 2 04 0 

catcaagggc atcgtgtatg agaaggaggc acggctgaaa gagaccatgc ggatcatggg 2100 

cctggacaac agcatcctct ggtttagctg gttcattagt agcctcattc ctcttcttgt 2160 

gagcgctggc ctgctagtgg tcatcctgaa gttaggaaac ctgctgccct acagtgatcc 222 0 

cagcgtggtg tttgtcttcc tgtccgtgtt tgctgtggtg acaatcctgc agtgcttcct 22 80 

gattagcaca ctcttctcca gagccaacct ggcagcagcc tgtgggggca tcatctactt 2340 

cacgctgtac ctgccctacg tcctgtgtgt ggcatggcag gactacgtgg gcttcacact 2400 

caagatcttc gctagcctgc tgtctcctgt ggcttttggg tttggctgtg agtactttgc 2460 

cctttttgag gagcagggca ttggagtgca gtgggacaac ctgtttgaga gtcctgtgga 2 52 0 

ggaagatggc ttcaatctca ccacttcggt ctccatgatg ctgtttgaca ccttcctcta 2580 

tggggtgatg acctggtaca ttgaggctgt ctttccaggc cagtacggaa ttcccaggcc 2 64 0 

ctggtatttt ccttgcacca agtcctactg gtttggcgag gaaagtgatg agaagagcca 2700 

ccctggttcc aaccagaaga gaatatcaga aatctgcatg gaggaggaac ccacccactt 2 7 60 

gaagctgggc gtgtccattc agaacctggt aaaagtctac cgagatggga tgaaggtggc 2 82 0 

tgtcgatggc ctggcactga atttttatga gggccagatc acctccttcc tgggccacaa 2880 

tggagcgggg aagacgacca ccatgtcaat cctgaccggg ttgttccccc cgacctcggg 2 94 0 

caccgcctac atcctgggaa aagacattcg ctctgagatg agcaccatcc ggcagaacct 3 000 

99gggtctgt ccccagcata acgtgctgtt tgacatgctg actgtcgaag aacacatctg 3 060 

gttctatgcc cgcttgaaag ggctctctga gaagcacgtg aaggcggaga tggagcagat 312 0 

ggccctggat gttggtttgc catcaagcaa gctgaaaagc aaaacaagcc agctgtcagg 318 0 

tggaatgcag agaaagctat ctgtggcctt ggcctttgtc gggggatcta aggttgtcat 3240 

tctggatgaa cccacagctg gtgtggaccc ttactcccgc aggggaatat gggagctgct 33 00 

gctgaaatac cgacaaggcc gcaccattat tctctctaca caccacatgg atgaagcgga 3360 

cgtcctgggg gacaggattg ccatcatctc ccatgggaag ctgtgctgtg tgggctcctc 3420 

cctgtttctg aagaaccagc tgggaacagg ctactacctg accttggtca agaaagatgt 3480 

ggaatcctcc ctcagttcct gcagaaacag tagtagcact gtgtcatacc tgaaaaagga 3 54 0 

ggacagtgtt tctcagagca gttctgatgc tggcctgggc agcgaccatg agagtgacac 3 60 0 

gctgaccatc gatgtctctg ctatctccaa cctcatcagg aagcatgtgt ctgaagcccg 3 660 

gctggtggaa gacatagggc atgagctgac ctatgtgctg ccatatgaag ctgctaagga 3 72 0 

gggagccttt gtggaactct ttcatgagat tgatgaccgg ctctcagacc tgggcatttc 3780 

tagttatggc atctcagaga cgaccctgga agaaatattc ctcaaggtgg ccgaagagag 3 84 0 

tggggtggat gctgagacct cagatggtac cttgccagca agacgaaaca ggcgggcctt 3 90 0 

cggggacaag cagagctgtc ttcgcccgtt cactgaagat gatgctgctg atccaaatga 3 960 

ttgctgacat agacccagaa tccagagaga cagacttgct cagtgggatg gatggcaaag 4 02 0 

ggtcctacca ggtgaaaggc tggaaactta cacagcaaca gtttgtggcc cttttgtgga 4080 

agagactgct aattgccaga cggagtcgga aaggattttt tgctcagatt gtcttgccag 414 0 

ctgtgtttgt ctgcattgcc cttgtgttca gcctgatcgt gccacccttt ggcaagtacc 42 00 

ccagcctgga acttcagccc tggatgtaca acgaacagta cacatttgtc agcaatgatg 4260 

ctcctgagga cacgggaacc ctggaactct taaacgccct caccaaagac cctggcttcg 432 0 

ggacccgctg tatggaagga aacccaatcc cagacacgcc ctgccaggca ggggaggaag 43 8 0 

agtggaccac tgccccagtt ccccagacca tcatggacct cttccagaat gggaactgga 4440 

caatgcagaa cccttcacct gcatgccagt gtagcagcga caaaatcaag aagatgctgc 4500 

ctgtgtgtcc cccaggggca ggggggctgc ctcctccaca aagaaaacaa aacactgcag 45 60 

atatccttca ggacctgaca ggaagaaaca tttcggatta tctggtgaag acgtatgtgc 462 0 

agatcatagc caaaagctta aagaacaaga tctgggtgaa tgagtttagg tatggcggct 468 0 

tttccctggg tgtcagtaat actcaagcac ttcctccgag tcaagaagtt aatgatgcca 474 0 

tcaaacaaat gaagaaacac ctaaagctgg ccaaggacag ttctgcagat cgatttctca 4800 

acagcttggg aagatttatg acaggactgg acaccagaaa taatgtcaag gtgtggttca 4 8 60 

ataacaaggg ctggcatgca atcagctctt tcctgaatgt catcaacaat gccattctcc 4 92 0 

gggccaacct gcaaaaggga gagaacccta gccattatgg aattactgct ttcaatcatc 4980 
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ccctgaatct caccaagcag cagctctcag aggtggctct gatgaccaca tcagtggatg 5040 

tccttgtgtc catctgtgtc atctttgcaa tgtccttcgt cccagccagc tttgtcgtat 5100 

tcctgatcca ggagcgggtc agcaaagcaa aacacctgca gttcatcagt ggagtgaagc 5160 

ctgtcatcta ctggctctct aattttgtct gggatatgtg caattacgtt gtccctgcca 5220 

cactggtcat tatcatcttc atctgcttcc agcagaagtc ctatgtgtcc tccaccaatc 5280 

tgcctgtgct agcccttcta cttttgctgt atgggtggtc aatcacacct ctcatgtacc 5340 

cagcctcctt tgtgttcaag atccccagca cagcctatgt ggtgctcacc agcgtgaacc 54 0 0 

tcttcattgg cattaatggc agcgtggcca cctttgtgct ggagctgttc accgacaata 54 60 

agctgaataa tatcaatgat atcctgaagt ccgtgttctt gatcttccca catttttgcc 5520 

tgggacgagg gctcatcgac atggtgaaaa accaggcaat ggctgatgcc ctggaaaggt 558 0 

ttggggagaa tcgctttgtg tcaccattat cttgggactt ggtgggacga aacctcttcg 564 0 

ccatggccgt ggaaggggtg gtgttcttcc tcattactgt tctgatccag tacagattct 5700 

tcatcaggcc cagacctgta aatgcaaagc tatctcctct gaatgatgaa gatgaagatg 57 60 

tgaggcggga aagacagaga attcttgatg gtggaggcca gaatgacatc ttagaaatca 5 82 0 

aggagttgac gaagatatat agaaggaagc ggaagcctgc tgttgacagg atttgcgtgg 58 8 0 

gcattcctcc tggtgagtgc tttgggctcc tgggagttaa tggggctgga aaatcatcaa 5940 

ctttcaagat gttaacagga gataccactg ttaccagagg agatgctttc cttaacaaaa 6000 

ataggtatct tatcaaacat ccatgaagta catcagaaca tgggctactg ccctcagttt 6060 

gatgccatca cagagctgtt gactgggaga gaacacgtgg agttctttgc ccttttgaga 612 0 

ggagtcccag agaaagaagt tggcaaggtt ggtgagtggg cgattcggaa actgggcctc 618 0 

gtgaagtatg gagaaaaata tgctggtaac tatagtggag gcaacaaacg caagctctct 624 0 

acagccatgg ctttgatcgg cgggcctcct gtggtgtttc tggatgaacc caccacaggc 6300 

atggatccca aagcccggcg gttcttgtgg aattgtgccc taagtgttgt caaggagggg 63 6 0 

agatcagtag tgcttacatc tcatagtatg gaagaatgtg aagctctttg cactaggatg 642 0 

gcaatcatgg tcaatggaag gttcaggtgc cttggcagtg tccagcatct aaaaaatagg 64 8 0 

tttggagatg gttatacaat agttgtacga atagcagggt ccaacccgga cctgaagcct 654 0 

gtccaggatt tctttggact tgcatttcct ggaagtgttc taaaagagaa acaccggaac 6600 

atgctacaat accagcttcc atcttcatta tcttctctgg ccaggatatt cagcatcctc 6660 

tcccagagca aaaagcgact ccacatagaa gactactctg tttctcagac aacacttgac 672 0 

caagtatttg tgaactttgc caaggaccaa agtgatgatg accacttaaa agacctctca 67 8 0 

ttacacaaaa accagacagt agtggacgtt gcagttctca catcttttct acaggatgag 684 0 

aaagtgaaag aaagctatgt atgaagaatc ctgttcatac ggggtggctg aaagtaaaga 6900 

ggaactagac tttcctttgc accatgtgaa gtgttgtgga gaaaagagcc agaagttgat 696 0 

gtgggaagaa gtaaactgga tactgtactg atactattca atgcaatgca attcaatgca 7 02 0 

atgaaaacaa aattccatta caggggcagt gcctttgtag cctatgtctt gtatggctct 7080 

caagtgaaag acttgaattt agttttttac ctatacctat gtgaaactct attatggaac 714 0 

ccaatggaca tatgggtttg aactcacact tttttttttt tttttgttcc tgtgtattct 72 00 

cattggggtt gcaacaataa ttcatcaagt aatcatggcc agcgattatt gatcaaaatc 72 60 

aaaaggtaat gcacatcctc attcactaag ccatgccatg cccaggagac tggtttcccg 7320 

gtgacacatc cattgctggc aatgagtgtg ccagagttat tagtgccaag tttttcagaa 7380 

agtttgaagc accatggtgt gtcatgctca cttttgtgaa agctgctctg ctcagagtct 7440

atcaacattg aatatcagtt gacagaatgg tgccatgcgt ggctaacatc ctgctttgat 7500

tccctctgat aagctgttct ggtggcagta acatgcaaca aaaatgtggg tgtctccagg 7560

cacgggaaac ttggttccat tgttatattg tcctatgctt cgagccatgg gtctacaggg 7620

tcatccttat gagactctta aatatactta gatcctggta agaggcaaag aatcaacagc 7680

caaactgctg gggctgcaac tgctgaagcc agggcatggg attaaagaga ttgtgcgttc 7740

aaacctaggg aagcctgtgc ccatttgtcc tgactgtctg ctaacatggt acactgcatc 7800

tcaagatgtt tatctgacac aagtgtatta tttctggctt tttgaattaa tctagaaaat 7860

gaaa 7864 
<210> 3

<211> 22

<212> DNA 

<213> Homo sapiens 



<400> 3 

gcagagggca tggctttatt tg 

<210> 4 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 4 

ctgccaggca ggggaggaag agtg 

<210> 5 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 5 

gaaagtgact cacttgtgga gga 

<210> 6 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 6 

aaaggggctt ggtaagggta 

<210> 7 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 7 

catgcacatg cacacacata 

<210> 8 

<211> 27 

<212> DNA 

<213> Homo sapiens 

<400> 8 

ctttctgcgg gtgatgagcc ggtcaat 



<210> 9

<211> 20

<212> DNA

<213> Homo sapiens

<400> 9

ccttagcccg tgttgagcta 20



<210> 10 

<211> 26 

<212> DNA 

<213> Homo sapiens 

<400> 10 

cctgtaaatg caaagctatc tcctct 26 

<210> 11 

<211> 26 

<212> DNA 

<213> Homo sapiens 

<400> 11 

cgtcaactcc ttgatttcta agatgt 26 

<210> 12 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 12 

gggttcccag ggttcagtat 20

<210> 13 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 13 

gatcaggaat tcaagcacca a 21 

<210> 14 

<211> 10545 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<222> (1)...(10545)

<223> n = a, t, c, or g 



<400> 14 



acctcttata gaatgataga attcctctgg aatgattgga taacttcatt tcatccttga 60

cttttacctt ggaggatttc ttaccccttt tggcttctca aatttgacta ttaaaatgtt 120

gcctttaaaa ataggaacac agtttcaggg gggagtacca gcccatgacc cttctgcaag 180

gccccctaac tcaaggtagt ttccctggaa ctgtggttta tggaatgttt caggagtgtg 240

aggaggtata atttaaggct gtcctagcaa ggataccctt aaggatagag ggcccagtag 300

catctggagg ccagaaaagt taaactgagg cagtcagatt agcttcaggc tcaattaagc 360

tgatgggtca gcctgggaga aattgcagga tgactctcaa tatcccctcc cacccccaca 420

gcagccacga tctgtctgtc tttaatcatg ggtgcagtga acctgttctt tccaggtgtc 480

ttggccttca gtaaccttgt taggcttgtc cctgaacgtg gctaccgatc caaagacaca 540

tgatcagaga ggcaattaga gaacagacct tttccaaagc aagcatgttc tgttgggctt 600

agaagtttca tgtcctaata ttataggacc ctgtgcatct ctctggagat gaggcacatg 660

agtcatatct gtgattcttg cttttgtgtc aacatctcat gaataggcaa tcagagcttt 720

ggcaccaatg tattttcagt tcatatctga tgtagttaaa tccacctcct gctttgtagt 780 

ttactggcaa gctgtttttg atataagaca tctagaacac tgtaaatata taacattttt 840 

atttgtctat tatacctcaa ttacgaaaaa gacatctaga agcaacctca tcaagagaga 900

tactgaggcc gggcatggta gctcacactt gcaatcccat tactttggga ggctgaggca 960

ggtagatcac ttgaggtcaa gagtttgaaa ccagcctggc caacatgttg aaaccctgtc 1020

tctattaaaa atacaaaaaa gttagctggg cttggtggtg ggcacctgta atcccagcta 1080 

ctccggaggc tgaggcagga gaatcacttg aacctgggag gcagaggttg cagtgagctg 1140 

agatcacacc actgcactcc aacctgggca ccagagtgag attacatcta aaaaataaaa 1200

taaagtaata aaaaagagag atattgatag ctgttgttgg aaatttcaac ttccatctca 1260

cttctggtaa ctttttggaa gtttgttgaa caaagtggaa tacacgcaca tacacacaca 1320

cacatactct cttgtttgtt taaggtttaa tgaaatagct gtcatataat cactgttttt 1380

gaaagaggag aattagttgc tatctgtaca ttttgggtat gtgaactatt tggatagaac 1440

tctgagaaat gcattcagaa caacaaacaa aatcatagga gaaatagcta agtgggaagg 1500

ggcatataag agttgttgaa aaagttattt cttgagaaac cagctctaat gctaggcaag 1560

tcacttgctt tgggggaggc ctcagcttct ctgtctataa gattgcagca ggggtgtagt 1620

gggaatgagt cttcaacatt ccaagagatt ttatctacta atacgacagt caaatggagc 1680

atgactttgt ggaagcctct cctcttccac ccagaggggc caatttctct gtcccagtga 1740

gatgttgaca cttgtatgat ccctgcttgg agacttccct cttctggaac ctgccctggc 1800 

tcaggcatga gggctgactg tcacccttcg ataggagccc agcactaaag ctcatgtgtt 1860 

ggcagtgttc ttgcgggaag gaaaaagacc agccagccca tttgttactg cacaagcaaa 1920

cagcttctgg tagctgtaca gatacatgca ctttctttcc tcactgtgtt tccatagaca 1980 

gatttagtgc tgtagaagag tagagggcag tcacgggaag gagttcctgt ttttcttttg 2040

gctatgccaa atggggaaaa atcctcctat cttgtctttt tagtgtcatc ctctctcccc 2100 

ttttcttctt ctttataatt ctcatctctc atctctcctg gaaatgtgca tgtcaagttc 2160 

aaaagggcac aatgttttgg tgaggaagag gtgggagaac acgtgccagg tgctaactag 2220

ggtcatcatt tcccccttca cagccagctt cctgtgaatg tgtgtgtgtg tgtgtgtgtg 2280

tgtgtgtgtg tgtgtgtgtg tgtgtatttc ttttgccagc atcactgaat ctgtctgctg 2340

tctggtattc caggttttgg tttagggaaa agtaaaagta attttataat cccagctgtc 2400

atttaagcca cccctttgtg ggtagcatat ggtccactct ctcagttcat tgtcctaaag 2460 

atgcttcatc agaaaggaat aacttccacc ccgttactct ctgtcccctt actctgcttt 2520 

atttttcttc gtcaatccta ccaccaccac ccactgtttg aacaacccac tattatttgt 2580 

ctgtttccca tccctggtag aataggagcc ccatgaatga aggaactttg cttctgttgt 2640

tcaccactga atctctaagg tatggaacac acctggcatg tgataggcac tcgataaata 2700

tttgttgtgg ctcatgggca ccttgcagag ttaaggctgc agttgtttgt ggaatttata 2760

^Qtggtaatg aatatttatc tactattcct cttccaaggc gatcacacaa taatcaggct 2820

ttacactatc cagttcttag gtcttccaag ttatgacttg tgaggtatgt taattatgat 2880

aatagaaggc agtttatttg gttcagattt attgatgtgt aatttaccac agtaagactt 2940

cccctttaca aaagtatgat gagttttgac aaatggatac acatgtgtat ctaccactgc 3000

catgctcctt ttcagtctgt cgtcccctcc acccatgacc actggtcacc actgcagtga 3060

tttctgtccc cttcatttca ccttttccag aatgtcatat aaatggaatc atgcagtatg 3120 

tagttttttg tgtctggctt atttttctta gcattaggct tttgggattc atccaggttg 3180 

tcgcatgtaa cagtagctta ttccttttta tggctgagta agtgtcccag ttttatttat 3240 
atatttattt atgaggaggt gtctcactct gtcacccagg ctggagtgcg gtagcgcgat 3300

ctcagctcac tgcaacctcc gcctcccagg ttcaagcaat tctcctgcct cctgagtagc 3360

tgggattaca ggcacccacc gccacgccca actaattttt atatttttag tagagatggg 3420

gtttcaccat gttggccagg ctgatctcaa actcttgacc tcaggtgatc cgcccacctc 3480

tggctcccaa agtgctagga ttacaggcat gagccactgt gcccagcccc agttttattt 3540

attcaccagt tgatggtctt ttcgacaact aattgtttcc agtttttggc tattctgtat 3600

aaggcttcta taaatattca caaataccta ggatgggatg actgggtcat ataatagtac 3660

tgtataacct tagcagaaac tgtcaaacta ttttccaaag tggctcttcc attttacaat 3720 

tccacagtgt attgagtccc agtgtctcca tacacatgct agcactttta atatttaatt 3780 

tagtgggtat gtaatgatat ctcattgtgg ttttaatttg catttctctg cagctaatga 3840

tgagtgtttc tgcttatttg ggaaggtttt aatttagcag tctgttgtat tctgtagata 3900

ttaataactt caaaatatca gtggcatttg cagttaaaat ttccttaaaa aattggccaa 3960

aggtttccag cagtcacttc tgccatgccc aaactgtatg aaacaaggct gaggtgtgga 4020

gattgtcaca ttttggcaag gagtgatcca cttgggtgac tgatgagacc cagagagcgt 4080

acgcctcggg cttgagggtg aggacgggcg ggaagtcgac tgcatggccc tgctggcctt 4140

gggaggctgc ccagtcctta gctaaagctg gcagttatgg gaaacagact tagattctat 4200

tacgtttttc aggatgtccc aggagtcacc tgggaagctc agcagtcctt tgtgactttc 4260

aagcatatgg tagaagctgc tgaacacaga gctccctctt tggggataat ttgcccaaat 4320 

catttaatca ggcttgagaa atgagttacc acaggtccag gagtgctgcc acccttgaat 4380 

tctgacaccc tatttctcct atccgtctct taattaatta agcagacatc cccaagtgct 4440 

tacgacaagc caggaccctt ttgcatacta aggaaaacag ggatgaagga aacagaaatg 4500

gtctctgctc tgactcagaa ggtagaaatc ctctttccca gccaagtctt cctagggagc 4560

acgtaggaag ggctctgaac ccacgtgtca gttgcagggg aggatatcag gaaaggacat 4620

tgaagaagtg gagacctaag tttgagacct aggcattagc caggctagca gtgcttgaaa 4680

aagtgtctta ggacaagaga actcaccagt gaagtcccag tggtaggaga gcgtgcagca 4740

tattctgagc ctgtatacac atctccaggg cattgcttag caggtgggga gtggcaagag 4800

agtaggctgg agtcacagaa gggaggccag gtagaccttg gtgagcactg gactctatgt 4860

tcaggtgctg aggagctggc aaaaggtttt aagtcgggga gaggcatgtt cagatatttg 4920

gtctagctga gtaactttgg gtgctctgtg acaaatggtt gggagaccag tgaggtggca 4980 

gttgcggtca tctaggagca ggatcagagt ggcctattga ctgggatgac tgtgaagtgg 5040 

gatcctttcc agccagtaac tggaaatgtg tatgagggca gaagtgagtg tactgcattt 5100 

gaaacattga gaaatctagt acatagtact gtctctttta tatctttttt tttttttttt 5160 

ttgattttgg tttgtttgtt cactaacttg gaaaactgat gtggaaatgt ccctttggct 5220

tcagttacct gagcagaagg ggccgggcat tgccaaactc tcctcttagg acagaattgc 5280

tcccagtatt gatcattgtg ttctgagttg ggggagcaaa ttgtgcagga ggccaggtca 5340

gtgccaaggt gggtgggagg aattggagca ggaagcttgc ctaagtgtgc ccagcaaagc 5400 

cacggtagaa ctttctactg tggctctatg ctacttctta gcaaccttct ccatgtgctt 5460 

cctggagagt ccttggagtc agaacctttt tcttgaaacc cagacacttt acttccaaga 5520

aaatgctgtc caagaaaact catccttccc ttcttctcat gaacgttgtg tagaggtgtg 5580

tcttctcttc ctttgagctt ttccactcag ggtttagggg aggtgatatt ctatatttgg 5640 

gtttggctct gggtactgca acactaggct attaagattt catccttact gctttgcccc 5700 

tcctatcttt ccagaaaccc acaatggatt tgctagaaat aatggaacgt cctgtttgga 5760 

caggatataa ccatttctca gctagaggat attgttggaa tgaagaaaga taaatgggga 5820

gaagggaact cacattgctt tggcacttaa attaagccat gtactgtgtt gggaaattat 5880 

ttatattatc tcgttgaatc cacagtagaa cacagttgaa caccatacaa ggtaagtatt 5940 

gtcatcctta ttttaccatg aggaaattga tgcttagaga gcataaagcc ttggccaggg 6000 

gcacatagtt gggaagccgg ggctaattca tgcctgggct ctttctgata gttttccttt 6060 

tttaattgtc ccctcctcat tgttaccttg gggatttcaa gagattcatg tagcttctaa 6120 

atcaacgaac tgattcctgg agagcagctt ctgtatgaga aaaatctagc taattattta 6180 

tttcagtgtc tctggaatgc aagctctgtc ctgagccact tagaaaacaa tttgggatga 6240

caagcatgtg tctcacaatg ctgctctggt tgccagtgct gtgctgccag ttgtcatctt 6300

tgaacaaact gatgcagtgc tggtttaact cttcctcttt ttggagtaag aaactttgga 6360

ggcctgtgtc cttctagaag tttgctgagc aaatggtaag gaaaagaaat aggtcctaag 6420

gcttgactat ttcagagaat ttcttgattt attggactgt caatgaatga attggaatac 6480 

atagtggtag gctgtctttt cttctcagac actgcaattt cctccaatct cttgactttt 6540 

ctagaagttt taatccaagt ccttgttggg tggtagataa aagggtattg ttctactaga 6600

gactgacctt ggcatggaga tctcatttgg actcacagat ttctagtcta gcgcttggtt 6660 

ttgtatccat acctcgctac tgcattctta gttccttctg ctccttgttc ctcatgccca 6720 

gtgtcccacc ctacccttgc ccctactcct ctagaggcca cagtgattca ctgagccatt 6780

tcataagcac agctaggaga gttcatggct accaagtgcc agcagggccg aattttcacc 6840

tgtgtgtcct cccttccatt tttcatcttc tgccccctcc ccagctttaa ctttaatata 6900 

actacttggg actattccag cattaaataa gggtaactgc tggatgggtg gctgggatac 6960 

acagaatgta gtatcccttg ttcacgagaa gaccttcttg ccctagcatg gcaaacagtc 7020

ctccaaggag gcacctgtga cacccaacgg agtagggggg cggtgtgttc aggtgcaggt 7080

ggaacaaggc cagaagtgtg catatgtgct gaccatggga gcttgtttgt cggtttcaca 7140

gttgatgccc tgagcctgcc atagcagact tgtttctcca tgggatgctg ttttctttcc 7200

agagacacag cgctagggtt gtcctcatta cctgagagcc aggtgtcggt agcattttct 7260

tggtgtttac tcacactcat ctaaggcacg ttgtggtttt ccagattagg aaactgcttt 7320

attgatggtg cttttttttt ttttttttga gacagagtct cgctctgtcg ccatgctgga 7380

gtgtagtggc acaatcttgg ctcactgcac ctccgcctgc caggttcagc gattctcctg 7440

cctcagcctc ccaagtagct gggactacag gtgcctgcca ccatgcccag ctaatttttg 7500

tatttttagt agagacgggg tttcaccgta ttggctagga tggtctcgat ttcttgacct 7560

cgtgatccgc ctgcctcggc ctcccaaagt gctgggatta taggcttgag ccaccacgcc 7620

tggccgatgg tgctttttat catttgaagg actcagttgt ataacccact gaaaattagt 7680 

atgtaaggaa gttcagggaa tagtataagt cactccaggc ttgaggcaaa atttacaaat 7740 

gctgctgact ttgtatgtaa ggggaggcat tttcttagaa aagagaggta ggtctctggg 7800 

attccagtat gccatttcca tcctcagtgt ttttggccac ctgagagagg tctattttca 7860

gaaatgcatt cttcattccc agatgataac atctatagaa ctaaaatgat taggaccata 7920

acacgtagct cctagcctgc tgtcggaaca cctcccgagt ccctctttgt gggtgaaccc 7980

3-gaggctggg agctggtgac tcatgatcca ttgagaagca gtcatgatgc agagctgtgt 8040

gttggaggtc tcagctgaga gggctggatt agcagtcctc attggtgtat ggctttgcag 8100

caataactga tggctgtttc ccctcctgct ttatctttca gttaatgacc agccacggcg 8160

tccctgctgt gagctctggc cgctgccttc cagggctccc gagccacacg ctgggggtgc 8220

tggctgaggg aacatggctt gttggcctca gctgaggttg ctgctgtgga agaacctcac 8280

tttcagaaga agacaaacag taagcttggg tttttcagca gcggggggtt ctctcatttt 8340

ttctttgtgg ttttgagttg gggattggag gagggaggga gggaaggaag ctgtgttggt 8400

tttcacacag ggattgatgg aatctggctc ttatggacac agaactgtgt ggtccggata 8460

tggcatgtgg cttatcatag agggcagatt tgcagccagg tagaaatagt agctttggtt 8520

tgtgctactg cccaggcatg agttctgatc cctaggacct ggctccgaat cgcccctgag 8580 

caccccactt tttccttttg ctgcagccct gggaccacct ggctctccaa aagcccctaa 8640 

tgggcccctg tatttctgga agctgtgggt gaagtgagtt agtggcccca ctcttagaga 8700 

tcaatactgg gtatcttggt gtcaatctgg attctttcct tcaggcctgg aggaatataa 8760 

taactgagac ttgttttatt tctgcagagg gttctaagcc attcacttcc cagatgggcc 8820

aataatgctt tgagtaatct ggagatcatc tttaatgcgc aggtgaatgg aactcttcca 8880

cagagggatg tgagggctgt agagcagagt gaactccctg aaactcagac gtcagctctt 8940

tgtctctcta tctctgaaca cccttcctta gagatcccat ctctaggatg catttctctg 9000 

tagttagttt ctaagtctct tgttcctgtt ctgcctttat ttttttttcc tggattctaa 9060 

gccagtatcc ccacttggct gtcttaatgt agcttaacat gtctgtaatc aaaatgatca 9120 

tctttctgag attcaaaggg ctataaggga ctttggagag aatttcattc agttttcctc 9180

aaactagaat aatgcttgca ctgtctgtaa aagaacaaaa gtgtcaaagc atccttttgt 9240

tcactaaatt tcctttttta ttatagtgtt acttaaatat taggaagtta aaagtaggta 9300

taaacttctt ataggctgtt attatacaac tatatgaccc atacatattt acaaattaag 9360

tgcagccaaa attgcaaaat caataccatt caaattaata ccttaaatgt ggtgaggcag 9420

ctgttgttca actgaaacca aattataagt tgcatggcag taaatgctat catgctgatc 9480

attttgagtt tggccagtct atattatcat gtgctaatga ttgaattctc cacccatttt 9540 

tctacttgta tgaccttaat ttgatggcac ctgttccatc ctcatgagtt tgctacaatt 9600 

atactggtgc caacacaatc ataaacacaa atataaactt gggctttgaa atcttgtgcc 9660 

agaacttggc tttaaagtaa gcatttaaaa aatccatatg tgtttattag actttgttta 9720

gatgactgtt gaaatgaaaa caaagtgttt aaaatcctct tagagaactt aaatataatc 9780

cctcagcaat atgtatacag atcttccttt gagaaaaact gattgtgttc agcctctcat 9840 

gttacaaatg gggaacctga attctgaggt ctctagtgag agaacaggga ctggaatctg 9900 

tggatcctat ctgttttaat aataattgta aagtataata gataatatta tattaaaaag 9960 

agagnnnnnn acacttagaa tgagcttcca tgtgtgaggc actaactgat taggcattat 10020

taactagatt tattcctttt aaggccccgc gatgtactgt tatttccaca tgttgtagct 10080

ggggaacgtg ctactcagag aggttaagta acttgtctga ggtccacacc actaacaagg 10140

agcacaggta gggttcaaat ccagataatc tgactttgga gctggcactc taactcaatg 10200

tgcctaatcg cttttcagtg gtgtcattat tttgcctatt ctccatctga gaatattgaa 10260 

gtttctgact ccttccttgc ctttctccct gcctcccgtg gttatcccca ggtcttggtg 10320 

ttccagtcct ctatgtccgt ccttactctt attcctttgc tacagtgtga tccagggctc 10380

ctgcccttct tatcctggta gagggggccc acttgctggg aaattgtctc cgccatggtt 10440

tatccatgtt gtgtgtccat tagtgagtag tgggaagaat catatcatgt tggcaatgaa 10500

aggggggcta tggctctggg gtagtctagt ctgaactctt atttt 10545 



<210> 15 

<211> 4736 

<212> DNA 

<213> Homo sapiens 



<400> 15 

cttttttttt tttttttttt tttttttttt tgaggtgaag tctcactctg ttgcccaggc 60 

tggagtgcaa tggagcgatc ttggctcacc ccaacctctg tctcctgggt tcaaacagtt 120

ctcctgcctc agcctcccga gtagctggga ttacaggctc ccgccaccat gcccagctat 180 

ttttttgtat tttcagtaga gatggggttt cacccttttg accaggctgg tcttgaactc 240 

ctgacctcat gatcaaccca cctcagcctc ccaaagtgct gggattacag gtgtgagcca 300

ccacgcccgg cctcataagt attttctaaa tttatttaca gtcatgccat ttaaaaggaa 360

agttgtattc ctgtctttgt taatatttat aagtgatttt attcagctac aagcttggaa 420

tggcatataa ttttgtattc tgcttttttc acttaatatt acatggctaa tgatttctgt 480

gtttcataaa cattattctg atgatggcat gatatattgt tgagtacatg taccataatt 540

gaatcatttc cctattgcta tgcaattaag ttgtttccaa tattttgcaa ttataatgtt 600 

tcaatgaatg aataacttta tgcatatagc tttttgatat cttaagttca gtttcctagg 660 

atgaatttcc aggaatagta attgggcaaa tgggataaac atgactcttg aatacgtatt 720

gttaacattg ctttcccaaa gggctcaact gatttatatt tccgtgttca ttatctttta 780

aaccagctca tttactcacc aaacattttt aaagccatta tcatgtggta ggcttagtaa 840

gaagaaagtg accctaaggg agaagcttat atataaatag ggtccctggt gtaccaagtg 900

ctgatacaga cacaaagtac ctggggaaat tgagatgagg gagtcctggc tcagctggga 960

gaaaagttca ttttcataga gtcatggttt tgttctttgg cagaaagaaa attgctttct 1020

tccccacccc cacccccagc tttattgagg tataattgac aaataaaaat tgtatatctt 1080 

taagatatgc aatgtgatat atatgtatat ctcaacttaa aaaataagct acagaataaa 1140

aaggtgtttg ctattaaaaa aaaagaaaag gctgaatgtc attcccaagc ttggaaattt 1200

gagtatgttg cctctttggg attatttaca gaaatattag caagaccagc cccatctttg 1260

gtcttgagta ctccactgtc agcatgcttt cttccagaga gggatccatt tgcctttatt 1320

tttcattctg ttgtgccgtc tatgcaaact attcttgata gttttatggt aacagtgttt 1380

ttttgttcca tgagataaat ttatacatgc tcattgtgga aaatttagaa aagacaggaa 1440 

agtattaaaa acatcmcytt tttttttttt tttttttttt tttttttamg cagacagagt 1500 

cttgctctgt cgcccaggcc ggagtgcagt ggcgtgatct cagctcacag caacctccgc 1560 

ttcccaggtt taagtgattc tcctgcctca gcctcccaag tagctgggag tacaggcatg 1620

caccaccacg cccggctaat tttgtatttt tagtagagat ggggtttcac catgttggcc 1680

aggctggtct caaactcctg acctcaggtg atccgcctgc cttggcctcg caaagttctg 1740

ggattatagg caggagccac tgcgccagcc acacctacgt tcttatcatc ctagtacatc 1800

cactgtcatt atcttgctgt atttccttct gcccagtctc actctgatca tgcagtggcg 1860 

tgatcatgca gtgatctcgg ctcactgcaa cctaggcctt ctgggttcga gtgattctcc 1920

tgccttagcc tcctgggttc aagtgattct cttgccttgg cctcccaagt agctgggatt 1980

acaggcatac acccccatgc ccatctaatt tttgtatttt tagtagacac agcgtttcac 2040

taaaattttg tatttttagt agagatgggg tttcaccatg ttggccaggc tggtctccaa 2100

ctcctgacct caggtgatcc gcctgccttg gcctcacaaa gtgattacag gcatgagcca 2160

ctgcatccat cgccaaaaag attttttaaa agagtttaat gtagaaccat atcaaaggtc 2220

tttggaaata aaaaacagtt ttttaaaaat atcagaaata aaacaacaaa taaataaata 2280

aataaaaaca cccaaaacaa tctgaagcac gagcacctag cagaaaggtt caattatgat 2340

ctattcatag agtggaatat caagtagaca ttacaggaca tgttttaaga ttatatttta 2400

tgtcatggga aatgctctcc cagtatgatg ttaaatgaaa aaacagaata caaaagtata 2460

tatgctgcat agtctcaata ttgtagagaa aaaatattat ttatgtatgc atgaaaaaag 2520 

acaaaagatg ttaacagaga tccattgtta cttcagttta ctagggattg tctctgggag 2580 

gtaggattaa ggtgatttat atttaccttt ttaaactttt ctgtattttt ttattttcaa 2640 

attttccata aaaatataag gacttgaaga tcaagaaaaa atttctgctt tggctcagtg 2700 

cagtcgtcac gcctgtaatc ccagcagttt gggagcccta ggggagagga tcacttgaac 2760

ccaagagttt gacgttccag tgagctatga tctccggatc gtaccgcctg gacgatggag 2820

caagaccctg tctcaaaaaa aaaaatcttt gctttttttt tttgtttgtt tttgagacgg 2880

agtctctctc tgttgcccca gctggagtac agtggcacaa tctcagctca ccgcaacctc 2940

tgcctcctgg gttcaagcga ttctcttgcc tcagcctccc aagtacctgg gattccatgc 3000

acccaccact atgcccagct acttttttgt attttcagta gagacagggt ttcaccatgt 3060

tggccaggct ggtctcgaat tcctgacctc agctgatcca ccggccttgg cctcccaaag 3120

tgctgggatt acaggcatga gccactgtgc ccagcccaat cttttgcttt ttttaaaaaa 3180 

agaagacaaa aagggatttt ataccagtat tatcttggct gtgtgactct gaagccacag 3240 

ttgtaagtta taattactct gaaacacaag gccctgtgac tcttttgggc tctttggtgt 3300

ttatcttgat tacaacgttg gaatatagaa atgaaaggaa tgggagaggt gatagacttc 3360

aggcagtgta actagttgtc tgaacactac tggctcaatt atattgtgtc tagtgatttc 3420

catcttgtcc gtctgctaat ttatcgcctg gtaactcact gaggcagggt tttcctttgg 3480

agaaacctca ttgttttaac cagtgtatca tgcttgttta gaagttcaat gatcttttta 3540

actcatcgga gaagatgatg accagacctg gacagatggg gaaggacttt gcactctctc 3600

tttacagtcc tgagtgcaca caggtcaata tggaactatg tgtgaatttt cattgtcttt 3660

gagagccctc ttctctgccc catagggagc agctttgtgt gcaattagag gagcaagggt 3720

tgtgtgtatt tagcacagca ggttggcctg gtcctctcct ctcaacatag tcaccacata 3780 

cctggcacta tgctaaggct gggaatgcag acagatgggt gcctgctttc agagtgctca 3840

atgtgctgag gaagccagca acagaaacag atgatttcag gagctccagg aaaatgctac 3900

^99^9SSLgtg tgcctgggtt actggagtag cacaggagga gggcttctag ctcaggctga 3960

gattttagta aaggaaatta tgccacgatg aatcctgaag aatgaataga agtgaaccag 4020

ataaagcacg ataggaagca tcttccctta cctaagggaa gacacagagg tatatggaat 4080

ggtatgttaa aaggttggga ctccaaacag ttctgttaaa gcttagagag tggtgggaga 4140

gactggagaa gttgattaat tagtaaatga agttgtctgt ggatttccca gatcccagtg 4200

gcattggata tccatattat ttttaaattt acagtgttct atcttatttc ccactcagtg 4260

tcagctgctg ctggaagtgg cctggcctct atttatcttc ctgatcctga tctctgttcg 4320 

gctgagctac ccaccctatg aacaacatga atgtaagtaa ctgtggatgt tgcctgagac 4380

tcaccaatgg cagggaaaat ccaggcaatt aacgtgggct aaattggact tttccaaaga 4440

tgctgtcttt gggaaacatc acacatgctt tggatcagaa aacctaggct tctaatttgt 4500 

tgataaggca tgaactcagg agactgtttt cagtcctagt gaatggtgat aattgtaatt 4560 

ataacagtag acaacatctc ttttacacat tttaaatcat gaaaatagaa taaccttact 4620

gataatttta gaaagtggtg attaaaagca catttaagat aatgccttaa cacctagtct 4680

tttccatatg catgatgtct taatcacaca ttgcaaatca tggaacacag aatttt 4736

<210> 16 

<211> 4768 

<212> DNA 

<213> Homo sapiens 

<400> 16 

atcttacaat cacagtcttt ctcttagggc tgggctcagt gggtggattg acactgcaga 60

aatggccaga tctaaaggat caacatttac gtagctggga aatgtagctg ggacttcagt 120

ttcactgccc tagtgatttt tcctaccact aagcagctca gtccataccc ctacgagacc 180

cacaagctta tgagatactg ttcttccagg aaagcagtgg ggccagggcc accttttaat 240

tgtgtttctt ggcctggtcc catctttctc acaatatata gcaacagtta tttacttgct 300

gattttctaa tgcacatcac acatagtcat attaaacaca cacacacaca cacacacaca 360

cacacacccc tcaagaaaca ttttctgaga cgtgatttcc tgatttcatc aaaaaagaaa 420

agagcgggcc aggcacagtg ggaagtcaag gtgggtggat cacttgaggt caggagtttg 480

aaaccagcct ggccaacacg gtggaacctc gtctctacta aaaatacaaa aattagccag 540

gcgtggtggc gcacacctgt aatcccagct actggggagg ctgaggcagg agaattgctt 600 

caacctgcga ggctgaggtt gcagtgagcc gagattgcgc cattgcactc cagcctgggc 660 

aacagagtga gactctgtct caaaaaaaaa aaaaaaaaaa aaagcataaa ctgaaattta 720

tatgcaattt atatgcctgt gagataattc tgttttctct tttggaaccc caaagagatt 780

tttttgattg atgagcaaat acattttaga ttttatttaa gcattatgcc aagcaccact 840

gaagtataag tttcaagggc aaactcagtt ttttcatcta ctagacgaat gattttctgg 900 

aatgattaca agcaggcaag atggtgtagt ggaaatagca aatgtcttcg gcatcagaca 960 

agttggggtt tgtttgtatc ctgcctctgc ccttcaccga ggttgtgatc ttgggcagat 1020

tgttgagttt taacctagat tcctctgact ccagatcata aattttcaga aaagttctga 1080

aattcttgta tatactgatg gtaaatgaga cttttcctta catctatgca cttctttgtt 1140

tgtttgtttt gagatggtct tgctctgttg cccagactgg agtgcagtag tgcaatctcc 1200

gctcactaca atgtctgcct cccaggttcc agtgagcctc ctgcctcagc ctcccaaata 1260

gctgagacta caggcatgtg ccaccacgtc cggctaattt ttgtattttt agtagagaca 1320

gggttttgcc atgttgacca cactggtctc gaactcctgg cctcaggtga ttcgcccgcc 1380

tcagcctccc aaagtgctgg gattacaggc atgagccacc atgcccggcc atatccatgc 1440

acttcttgca accttacctt cttttctcat caccctccag ggacctagtt ggaagagcag 1500 

agttaaaagt taaggtgaaa cttggagagg tgtcttgtcc ctaggaacaa aggactggtt 1560 

tgaaattctc tgtaaatctt ccccagttca aaccagagtt atcaaggtct taaaaacttc 1620

cctgggtcct gagagcccat tatattattt acttgtcttc ctgtacaccc actgcctagt 1680

cctgatccta cttttgtttg caaataggat ggggcacaac gtacaaggaa gggcctttgc 1740

cacccctgct aagggataac ctgaaatacc ttcaccatca ctgccctgtg ctgcttttca 1800

cctatgccag tctgtctaca gtgccagtgt ctcctggcat tgaaagggga gaatcttttg 1860

gtcctttgag tatttggttg ggttacataa atctccctga atgaagagca gctgacttag 1920

gcaaggggcc ttgtttggtt ttccttgaac tattaacagg aagataggga gattaactgt 1980 

gtaaatgttc aataggccag agtccctgca gagggtggcc acagtgatca gatcttatca 2040 

catccttgct ttgggtgttg cctctctggt tggagtatgg atagaaaaga aagaaagacc 2100 

ctatattgaa atgcaaagtg cagcaagtcc tgactttgga ttaacttctc agcccatttg 2160 

catgaaaata aaaagatgaa taaaacaagg ttcccacttt ggagggaggt ggtagctgtg 2220

^gatggaagg agtgttcctg ctgggcaaca gcagagtaag tgctggggta gattcactcc 2280

cacagtgcct ggaaaatcct cataggctca tttgttgagt ctttgtccta caccaggcac 2340

tctgcaaaaa cgctttgcct gcaaggtctc atgcgatgct caccacagct ctgtgaagtt 2400

aattgtactt ttatcaccat tttacagatg agaaaactga gggtatgggg tcaatgactt 2460

ggctaaagtc actgcttagc aagctgcagg gactggatgt gaattccaat tggtttgact 2520

ccaaagcctg tgaagctact tgttcttcac cacctagagc tgtggttctt gataactgtg 2580

aactcttttg gggtcacaaa tagccctgag aatatgatag aagcaggagc tctggccttt 2640

ctgtccatac ctgaacaggt ccttgggtta agagcccctc gtccagggcc tattaatctt 2700

gatcctcata agcagcatcc atgtattacg gccgcaaacc aaactgtgcc agaccgaatc 2760

ctaggaccaa gcccaaatat gtcccatcat ccttttggta agaagctcat tgtaagaaag 2820

aaagaggaga gcaagaggat gacctagtgc atggggcctc attgttttaa ttagtgacaa 2880 

aacaacaata ataacaacaa aacccccgaa gcttcacaga tgacatcaga ccccaagcct 2940

gtgtgttttt caggtgccct tgaggagctt tgtagctggc agaggaggtg aaactgacaa 3000

atgtttggca gatggaggag agtaccagag gggtttgaga tgagctaaat tccaatctaa 3060

ccgcagtgtt gaggaagagg cttggattgg gaccatggag atgggggttc tactcccagt 3120

cacgccagct gactttgcga gtgttctttg tcagtcactt tatcttattt tatttatttt 3180

tatttttttg aaatggagtt tcgctcttgt cgcccaggct ggagtgaaat ggcgcgatct 3240

tggctcactg caacctcccc ctcctgagtt caagcgattc tcctgcctca gcctccagag 3300

tacctgggat tacaggcgcc tgccaccaag cccatcgaat ttttgtatgc ttagtagaga 3360

cagggtttcg ccatgttggc cagggtggtc ttgaactcct gacctcaggt gatccgccca 3420 

ccttggcctc ccaaagtgct gggattacag gcgcgagcca ctgtgcccag cccacttcat 3480 

cttaccgtag ttacctcctt agagtatgaa aaaataggct tagggcatcc ccaagtcccc 3540

tctatgtctg agagctgagg ctggctgtca aagaggaact aaggatgcca gggactttct 3600

gcttaggacc cctctcatca cttctccaac gctggtatca tgaaccccat tctacagatg 3660 

atgtccacta gattaagaat ggcatgtgag gccaagtttc cacctgagag tcagttttat 3720 

tcagaagaga caggtctctg ggatgtgggg aatgggacgg acagacttgg catgaagcat 3780

tgtataaatg gagcctcaaa atcgcttcag ggaattaatg tttctccctg tgtttttcta 3840

ctcctcgatt tcaacaggcc attttccaaa taaagccatg ccctctgcag gaacacttcc 3900

ttgggttcag gggattatct gtaatgccaa caacccctgt ttccgttacc cgactcctgg 3960

ggaggctccc ggagttgttg gaaactttaa caaatccatg taagtatcag atcaggtttt 4020

ctttccaaac ttgtcagtta atccttttcc ttcctttctt gtcctctgga gaattttgaa 4080

tggctggatt taagtgaagt tgtttttgta aatgcttgtg tgatagagtc tgcagaatga 4140

gggaagggag aattttggag aatttggggt atttggggta tccatcacct cgagtattta 4200

tcatttctgt atgttgtgaa catttcaagt cctgtctgct agctattttg gaatatacta 4260 

tatgttgtta atgatatcat gcagcagacg tgcatctgaa tgggctggct ctaggagcta 4320

gagggtaggg gctggcacaa agatgcatgc tggaagggtc cttgcccata agaagcttac 4380

agccaaggct aggggagttc tgtcttctct gcatcaggtc acctctctca cctctgtcac 4440

tgccccatca gactacaatg tctgcaggtc tttctcccct gagtgtgagc tccctgagca 4500 

aagcaggatg ctgccccttc cctttgtatt ccttgctcct tgcttcagtg cctgtacata 4560 

agtatgggca taataagtgt cccccaaatg agacattgag gattcttcaa atgcacagga 4620

ccgtgatgtg agttaggacg gagtaaggac gatgggatgt ggctcatgac aatcctgagg 4680

aagctgcagc tgcggcacgc agggccacac tgtcatgttc atggacccta gactggcttt 4740 

gtagcctcca tgggcccctt ccatacac 4768 

<210> 17 

<211> 1295 

<212> DNA 

<213> Homo sapiens 

<400> 17 

tcatgactgc cattggtata aagatgaata taatccagac cagattcatg attattcata 60 

catttttagt gtattaactt ttaattctgc ttttaaaata aattaaaaca ttctaatatg 120 



cccttaagag tatcccagcc caggccactg agcctactgt ggttcatgga taagtttgcc 180

cctgggggca tgtgtgtgca tgcatgtgtg tgcacatgca tgatgagccg ggccttgaag 240

ggtggtaaga tttgggtgtg tagaccaatg gagaaaggca tttggggcag tgatgatggg 300

tgggggaggg aacatggtga tgaatggagc tgggtgtggg gagccatggg agtgggttag 360

ggccagcctg tggaggacct gggagccagg ctgagttcta tgcacttggc agtcacttct 420

gtaaagcagc agaggcagtt ggcctagcta aagcctttcg ccttttcttg caccctttac 480

agtgtggctc gcctgttctc agatgctcgg aggcttcttt tatacagcca gaaagacacc 540

agcatgaagg acatgcgcaa agttctgaga acattacagc agatcaagaa atccagctca 600

agtaagtaaa aaccttctct gcatccgttt ataattggaa attgacctgc accagggaaa 660

agagtagccc aggtgtctgg ggcttgttcc cattagatct tccccaaggg gtttttctcc 720

ttggtggctg gcctgtgggg cccctctcca ggaggcattg gtgaagaaac taggggagct 780

^gttgccaca gacagtgatg tactaatctt ctctgggaag acagaagaaa agtccccagg 840

gaagaatact acagacttgg ccttagggac agctaggggt gcagattgct gccaactgca 900

ttttttctga agttggccat atggttgcag tgaatggatt tatagacaga gtatttctgt 960

gcatataaga gcaattacag ttgtaagttg atatggataa gtgaaagtta agcacttctt 1020

tctaaaaaga gaatgcaatt cattttcccc taatcatttc aattagtctg atgggcattt 1080

gaacttgttg tctttaaaaa gtgaaatctt tacctctgat ctggtaagta tccaggcaat 1140

ttcttgtgtg ccacccagga ggtatctggg gagtgggcat tttctgactg aggcattggc 1200

tgccatagca tcagagcagc cttccaggca gtggcctggc aaggggacag aggctggtgg 1260

gagcagctgg ctgagtgcag ccagtaatgg catgt 1295

<210> 18

<211> 2188 

<212> DNA 

<213> Homo sapiens 

<400> 18 

agctctccag gtgattctga tgcatactta agtttgagaa ccattgcttg ttttgcatta 60 

aacaggagat tagtctctgc agcttgtggg aataaagctt taaatctctc caattttagc 120

tctgtgaaaa ggcagtgggg agacaggaat gaacggacta gtgccacaaa gctcaggtgg 180

ggtgggtgag atcatttaga agagaaagac cgggcatggt ggctcacgcc tgtactgtca 240

gcactttggg aggccaaggc aggttggatc acaaggtcag gagtttgaga ccagcctgcc 300

tatcatggtg aaaccctgtc tgtactaaag ataaaaaaaa aaaaatttgc cagtcatggt 360

gatgcatacc tgtaatccca gctactcggg aggctgaggc aggagaatct cttgaacccg 420

ggaggcgggg gttgcagtga gctgagattc caccattgca ctccaaccta ggtgacaggg 480

tgagactccg tctcaaaata aaaaaaaaaa aagaaaagga aaggctgtgt gtgtgtgtat 540

gtgtgtgtgt gtgtgtgtgt gtgtgtgtaa cagcaccatc acactgtttg agttgaggag 600

cacatgctga gtgtggctca acatgttacc agaaagcaat attttcatgc ctctcctgat 660

atggcgatgc tcccctatct cattcctgtg tgtgtttagc caggcaactg ttgatcatca 720

atattatgat aacgtttctc cactgtccca ttgtgcccac tttttttttt tttttgagtt 780

acttactaaa taaaaataaa acactatttc tcaatagact tgaagcttca agatttcctg 840

gtggacaatg aaaccttctc tgggttcctg tatcacaacc tctctctccc aaagtctact 900 

gtggacaaga tgctgagggc tgatgtcatt ctccacaagg taagctgatg cctccagctt 960 

cctcagtagg gctgatggca attacgttgt gcagctactg gaaagaaatg aataaaccct 1020

tgtccttgta atggtggtga aggggaggga ggtagtttga atacaacttc acttaatttt 1080

acttccctat tcaggcagga attgccaaac catccaggag tggaatatgc aacctggcgt 1140

catgggccag ctggttaaaa taaaattgat ttctggctta tcacttggca tttgtgatga 1200

tttcctccta caagggatac attttaagtt gagttaaact taaaaaatat tcacagttct 1260

gaggcaataa ccgtggttaa gggttattga tctggaggag ctctgtctaa aaaattgagg 1320

acaggagact ttagacaagg gtgtatttgg agacttttaa gaattttata aaataagggc 1380

tggacgcagt ggcactgagt tgagaactgt tgcttgcttt gcattaaata ggagatcagt 1440

ccctgcagct tgtgggaata aggctttaaa tctctccaat tttagctctg tgagatggca 1500

ctggggaaac agaaatgaac ggactagtgt cacaaagctc aggtgggatg gacgagatca 1560 

cttcaaaggt ctgtaatccc acgtctataa tcccagcact ttgggaggcc aaggcgggaa 1620

aatcacttga ggtcaggagt tcgagaccat cctggccaac aatgcaaagc ctgtctctac 1680

taaaaatatg aaaattagct cagcgtggtg gcatgctcct gtagtcccag ctactcgtga 1740

ggctgagaca ggagaatcgt ttgaacctgg gaggcggagg ttgcagtgag ccaatatcac 1800 

gccattgcac tccagcctgg ctgacagagt gagactccat ctcaaaaaaa aaaaaaaaaa 1860 

aagaatttta taaaatcagg aaataatatt agtgtttatg ttgaatttta actttagaat 1920

catagaaaac ttcctctggc atcattatta gacagctctt gtgcagtggg tagcaccaga 1980

cccagcttgc atggttattg atttttcaga gacacttttt gagcttattc tctggcagaa 2040

^ggggaactg cttcctcccc tatctcgtgt ctgcatacta gcttgtcttt acaagaagca 2100

gaagtagtgg aaatgtttat tcttgaaaat aagctttttg cttcacatga tctagaattt 2160

ttaaaattag aaaaatgtgc ttactgcg 2188

<210> 19 

<211> 1183 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<222> (1)...(1183)
<223> n = a, t, c, or g 

<400> 19 

agtaaaatgg agaattccaa attctgaaat tgttagaaca tagttctgtg tcttagttaa 60

atatcgacac ttacagataa atagcataaa tgctttctcc ccatatttca gcccagtcct 120

acttaaagac aacataaatt gcaaaatagt gaggatgttg ttcatctaat aaaagtggtt 180

ccaggaattc agactctgga ttcctgtttg ccaaatcatg tgtcccactc ttaagaaaac 240

gagttggact ntggattttt ctttgcaaga gggacaagag tgtgggagat actgagttaa 300

tgcaacttgc aggttttaag tgtcctgtca ttgtgccttg tgctttgata cattctgagt 360

ttcagtaaag agacctgatg cattggactg ttgcaatgga acctgtttta agatcttcaa 420

agctgtattg atatgaagtt ctccaaaaga cttcaaggac ccagcttcca atcttcataa 480 

tcctcttgtg cttgtctctc tttgcatgaa atgcttccag gtatttttgc aaggctacca 540 

gttacatttg acaagtctgt gcaatggatc aaaatcagaa gagatgattc aacttggtga 600 

ccaagaagtt tctgagcttt gtggcctacc aagggagaaa ctggctgcag cagagcgagt 660 

acttcgttcc aacatggaca tcctgaagcc aatcctggtg agtagacttg ctcactggag 720 

aaacttcaag cactaatgct ttcggaatgt gaggcttttc cttggacagc atgactttgt 780 

tttgtagaaa agtacggctg gctgggagtt tgtgatataa tttagttcag tggtattcta 840

agtgttctta gtgttctttc agacttttgg gccatctccc aaagggtgaa tgggaagaat 900 

aagctgggtg tggctgagtt taagccaaaa gttttttgtg cttgtttcaa tcagagaaga 960 

cctgcttttt catgttttta ctattataat actaagcaag agctcatttg aaaacagagt 1020

tcttcatatt taaaaaaaaa aagtcttgaa accattgatg ggaagatgga tatctattta 1080 

tgtttaaaaa cccatcataa agatgacatt gtgggctgtc acagttggaa ggccctggaa 1140 

ttagatgaga ccacactatt tagcttactt agtaataaca ttg 1183 

<210> 20 

<211> 8981 

<212> DNA 

<213> Homo sapiens 
<400> 20

ccgtttggca aatgctcagt aaaagaaaag ggttagaagg ggagaaaggc attttatccc 60

aagccttcag gaatcaggat gaggatgtct tcaccttgtg gtggggagta attatacaat 120

tagagacagc acattggagt gtggctgata tgctgtgtga tgatagctct agctctctgc 180

ctagcagagg aaggacattt caatagaaga aaaagtttaa gaccttgccg agaaacagag 240

aaaggatgtt tgtcttttta agaagttgaa aaccctgttt gcagacaaaa gccctccagt 300

tttggcagta aactttcatg caagggaaga aaaaggcagg ggatgacatt gttgacaatt 360 

gtgaggaatt accatgtgcc aggcactgtg cgaggggctt tgtacatatc ctctagtttt 420

agtgcttata aaaactctgt gatatgtgca cagcatttta aactttgctg catagtcgag 480

aaaatggaag gatggggaat ttgagtcatt tgcccagggt tctatagcta ccccaggttc 540

ccatgactgg agaattgggg cacagggtgg cgggggagag tgagtgacaa gaatcctaac 600 

aatcttattt ccattgagtc cttataaaag aagtggatta actaccacgt ttttaagttt 660 

ttcttaaatt taggttatgt ggatctggcg tttcttgttt tgtcctgggt ttgttttgtt 720

tttgctatgc tgtcttgaac atctgtcatc ttgtaggcct aacggtaaac acaaaaacac 780 

tttacctcct atagctttca attaagatct ctcagtttgt gtttgtaata gttttccagg 840 

caagttctcc ctaggttcgg cttctagtgt gttaaccttt agttataaag tgaacccaaa 900 

gagagaaagt agaaacaaaa cacctcacct gtttttgctc atgaattact ctctatggaa 960 

ggaacaatca tgaacacctc tgcgtatcac agaggcctat ctgagtctga cgtttaaggg 1020 

agaccgcgta ggtccctttg aggactgtga atgtgggagt cctgggactc tggtgaagaa 1080

cccgttccag aagagatgaa tgagctggac aagttctttc atagaacctt taggcaggtt 1140

ttcttagaaa tgcacattga ggattatgct tggatattgt gatgatcaga atgatactca 1200

atcccttctg catttggaat tctctttgaa agaaaacatc ccaggcagct atttctcaga 1260

gatagtgagt cccagccact tctagacatt ttcttgtgta gtctacatta taatttcaca 1320

gcagtctctg atatgacaaa tgtcaaaata gcccaacctt ctctaaactt cagagatgtc 1380

tgatatgata ttgaataaaa caatgctcat agaaacatca agaaaggtgg attttccctg 1440

gatacttttt tcctgcttga caaataacag tgaagaaact gatctcacgt ctttttctct 1500 

ttggaagcct gaacactcag aacccaactt gaggctcctc agctatagca attctgactt 1560 

cacagtctgt aaattattgt tctttttttt ctttagctta tgctttctgc cctaatttat 1620

cttttccctg ttctaatgaa ttattgtcct atatctgctg tgcagttagg tgacatataa 1680 

cagcaattaa atatatgaat tggtacatat aaagatttga ctaaaactcg atgtaaaaat 1740

aagtgttcta cattcaattt ccagtgttag aaacagtgct gacttgaaca gagtgacaga 1800

attccatctt tccctatttt tgacagcttt aaactttata ttttcttcct ttcttgtgag 1860 

ccgtcattaa cttgtttctc aaagccattc ccgtattacc catcttgcag acgcagacag 1920

atttgggaat ttgcggtcag agttgtattg gacacatccc cccagcccac atgagatcct 1980

tttaatctat tgcatattaa ctagttttaa gtacaatatt cctacttcat ttaaaaccat 2040

taatcaaaga atgagtttga aaatgaacaa aatgcaaact tacagttaga aataattgta 2100

gtgtctttag ttttggttag gagtcggttt cttgtttgtt aaactcaaga ttgtgaacag 2160 

ttttaattca cttgtttatt tccaatagag atttcaggtt tacatttgaa ttcagaaaca 2220 

aagttttctt tctcattaca gagaacacta aactctacat ctcccttccc gagcaaggag 2280 

ctggccgaag ccacaaaaac attgctgcat agtcttggga ctctggccca ggaggtaagt 2340 

tgtgtctttc cagtaccagg aagcggatca tccactgtat cagtattttc attcctgagt 2400

ctggcaagag gtccttttga gttgaatatc acatgggatg taatatcaat tttcaaagta 2460

taagtgatgt aaacaataat gttttgattt ccttatttta gaaatgaaga aacctaaaac 2520

tcatagatgt ctcagagcta attggttagt ggctaacagc tggatatcta gtttagaacc 2580 

ttctccattt tttctttttg cccctaggta atcatacatt tgtaaagagg agaattatct 2640

ctgccactgc ccatgcactg cttttgtctg accagcaatt tctccatatt gcttcttcag 2700 

tagcaaggcc aatcatttta ccaacacaca tgcttgctaa ctaacaggaa taacgtggta 2760

cccctaattc agccctttcc cttgaaagca tctggcttct gaggttcaac tatgggaata 2820

tggtctctta atgaacatta agttgagttt gccttttagg tccacatgtt gacaaatgta 2880 

tcagagtaat ctctgtccta ggatcagagg gcctgtaggc acttgcaaaa gcagttagct 2940

ctgactccca gccagtgcac actccacctt tctgactccc agccttgtct caaattaggc 3000
ttggaagcga ggaactgtct ggtgtccccc agcataggaa gctgagccag ggggcagtgc 3060

tcacaaacaa tacagacttt aacgtgtagg atattggaaa ataataattt gtggggaaat 3120

tgtctcagac ttggtccacc cttattttta gctgcttctc taatccgttt ttcttttttt 3180 

ggtgcttgta tctaacctac ccattttttg gtgcttgcat cattttttca aatatcaaaa 3240 

acgaacttta tgttttctaa caatgaaagt attgcatgtt cattgtggaa aatgctgaag 3300 

acttggaaaa tacaaaaatg ctgagatcaa acactattga tacgttagtg tatttcttcc 3360

tgtcctgttc tactttcttt ctttgaattc tgctcacgtg tttctgactg atgaggtctg 3420

acttttgggt tccttttcca gaggagaagc cttctttcag cttgccattt gttaccctgg 3480 

ttatgaaggc tggtaacctt ttttactagg tagagaagct ggaccaactg gggttcttcc 3540

^gggggag^a tgagaaagag aaactgtttt gcaagtccgt agctatttct ctagggccct 3600 

gttagctgac attgacatgc cttgcattgc tctgcagatc ccctcgcagc cctctgtccc 3660 

ttgttcattt ctggccttag agaaagcaaa gcagggtctg taacagggga ggctgcctct 3720

aaactcaggg tttggttaca gctgttttca cttacatcac tggccctggt tttttttttt 3780 

tttctggcat taaaaaaaaa aattggaagc aggtgatgtt cccattgctg atgtggtgga 3840

aactctccaa gtgaacaata tacgtttttc ttggcagctg tttcttgtgc cctgcttgct 3900

cctggtccag gacaagcaag gaccatctgc ctctttcaat agaacacctc cagatccctt 3960 

tgatcaaaag ttactcattg tctgacttgc tatttctgtg agataaatgg gagaagatca 4020

ataaatgcac ttgtttgtcc agtcagcgtg tggaaagttg ataattttga ccaaagcaca 4080 

accctgaaag gaaaagaaaa agggagtgaa tgtcttctga gaagctgcct aggttcagac 4140

agtgtcaccc atttccctgt atgctccaca tgacaaacct gagtgggtct catcatgtcc 4200

attttgcaga tggcaccaag gctcagaaag gttaggcaac ttttccagtc acccaatgag 4260 

ttaattgaca aaactgggat tcaaacccag aactgttgga ttccaaagcc tgtgttgttg 4320

cctgcttcgt gaaaaactcc agtagcgact ggaatagaaa ggagaacctt ccaagaaaga 4380

aaatacgcac tagcagaacc tggaaattgg gaggaaatga ggacttgagg aataagatga 4440 

atgaaagctg acctgagttt cacatctggg tgatgggaag ggaggacagg gaggcagcat 4500

ctcagatgtc cacccagcac cgaccagctg cctggcattg ctaggtgttg aggactcagc 4560 

agtgaacacg ctaacttctc tgctttcttg gggcacgtat agggtgagag acagaaacaa 4620

acaggtcagt gtacaatgcc acaggaggga tatatgcagt gaagaaaaag cagggtaagg 4680

ggcatagagc atgagaaggt gcttttttta aaggggktga ttaggaaagc tctctctaag 4740

gtgacagttg gacctgaagg agatgatagc atgtctgtgg tgagggaagg aaactccgaa 4800

caggaagaat ggcagataca aagacattga tgctagagca tgcctaagga atgtgtttaa 4860

ggaccaggga aagtgagcaa gtggtggggg gaggagagga gctcagagca ggaggaggtg 4920

agtgccatac aggcctggca agactttgga ttcctgctgg gtgagatgag aatccagcgg 4980

agggcttgag ggaggggaca tgatgtgatc tagagtttag actgtttaca ctctggttgt 5040

tgggttgaga agagactggg atgggggaaa gggaggacaa aggacattgt gctggattga 5100 

gaaagcagta agtcagtttc attcattcac tcaaccgatg atgttcaaat accaccatca 5160 

tccgtgggct aaaggatgaa gagccatccc tccctgagag tcaggaagca cttcccagat 5220

aaagtttgga gtgtgagctg aggtgtagga gaaagagtaa gagtttaccc ctgaaacggg 5280 

tgctgggaag agtcaatagt ttggaataac tcaataattt atggtgcttc tttagaaaga 5340 

tttgctggct ttatgtggga agaaatttkt ttttttgatt ggggagtggt gggttggtgg 5400

tgaggctgcc tgtggaaaga gaagtgagtg ttttgactca ctgttattta aaaatctcta 5460 

gggctgttcc aataagcaac aaaaggcaaa atggcctggt tctctgtccc ctttctgtct 5520

gtatgcctcg tacaggttat gaaaagaaaa agttgggaaa agctgtccac ctcacctaat 5580

tgtgttcttg tggagtgtgc tagatgcccc ctctctggag aaaaaaaatc cttgtggcct 5640 

ctgacccacc tctggagagc ctagttccct tctggaggca gaaggcaaag cttaggacct 5700

agagagtgct ggaccacgcc actcacagga accagcaggc tgtgaggttg aaagctaggc 5760 

atatggagct ttccaggctg ggtgcagggc ctcgtggccc ttcccctccc ctctgtgctc 5820

tatagctcag tcttcccagg cggtgtgaac acgcagtgac atttccagga atacagggat 5880

ttattaatga tttcttgtga aatgtttgga aatacaaagt actctataaa tatttcataa 5940 

tagcattggg gctgagaact ccacaaagtg ccggaataca tttgcatgta agacagaacg 6000 

ctgcctgggt cattgatgcc tgttgagtgg cagtcacaga cactgcctag ggtttctgac 6060 
tcacgctgtt gggactgttc tatgcagggc accctcttgt gtggcatagg atttgtgcct 6120

caccacacac tgttgtagct ttgctgtctt gatgatgagt agagggcagt gtccaggcca 6180

tggtataagc atctactgcc ccccagggtt accaaaacca agccaagttg tgtctcagcg 6240 

agctccgtga agcatggaga agttgagtac tcagagacat gacgtgactt ttcaaaggct 6300

gtaagctgac gagggacata gctagggttc agacttgagt ttttcttttt ctttttcttt 6360

ttcttttttt tttaagactg agtcttgctt ttgtcgccca ggctggattg cagtggtgct 6420

tggctcactg caacctctgc ctcccgggtt caagcaattc tcctgcctca gcctccccag 6480 

tagctgggat tacaggcacc tgccaccatg cctggccaac atttttgtat ttttttagta 6540 

gagatggggt ttcaccatgt tggccaggct ggtcttgaac tcctgacctc aggtgatcca 6600 

cccgcctcga cctcccaaag tactgggatt acaggtgtga gccactgcac ccggcccaga 6660 

ctcgagtttt tcatcttaat gctttttcat tgcctgacac tttactgaga ccaagatagg 6720

gaacttcaca tacagtacct tttctcccaa ggcggaagag ggctgttcaa tttctacact 6780 

agagttcggg gagttttaga aatgagtcag ttatcgagga tgagagcagt tcctgatagg 6840 

ctcaaccaca atgagatgta gctgttcaga gaaagcattc ttttatctat aaactggaag 6900 

ataatcccgg tgaaacgaag cccagcccca ggggcttcac taactccagg ctgtgcttct 6960 

caaactttag tgagcatagg aatcacctgg gcatcttgtg aagctgtaga tttgaattct 7020

gcaggtcggc agaggggtct cagaatccgc atttccaaca atgtctccag taatgctgat 7080

gctgctcgtc cctggaccac agattgggta gccaggttct ggcaagctca tcccaaggct 7140 

ttgagatgac atcagacaaa atatgttctg ggacatggct tttgagaggt caagaaaata 7200 

agatgtttct ttctcttctc atccccaacc cttgcactgc ccttttctcc cttcccctac 7260 

cctcctttct gtccccatcc ctgacgccag ctgttcagca tgagaagctg gagtgacatg 7320 

cgacaggagg tgatgtttct gaccaatgtg aacagctcca gctcctccac ccaaatctac 7380

caggctgtgt ctcgtattgt ctgcgggcat cccgagggag gggggctgaa gatcaagtct 7440

ctcaactggt atgaggacaa caactacaaa gccctctttg gaggcaatgg cactgaggaa 7500 

gatgctgaaa ccttctatga caactctaca agtgagtgtc catgcagacc ccagccctgt 7560 

ccccaacccc atccctccct tagttctggc cttggcctgt gtcatctcct ccctctgtag 7620 

cagcgttaga tgtctacatg cccatttgcc caccagactg agctcttcct agaggagaga 7680 

ggcttctctt gaatagctac ctgtccccag ttctctgaat gcagcctggc acatctcagg 7740

tgcacagtag tgtttatcaa tggaatgaat gattgacagc caaccttctg gttttctggg 7800 

ggatgtggaa gggtggcttc cagggtgatc aagaatgaga taatggcaga aggacaaatc 7860 

ctgcaagatc tcacttatat atggaatata tgtaaggtag aaagtgtcag tttcacatga 7920 

tgaataagtt cctgggatct tgatgtacat cgtgatgact atagttagta acactgtata 7980 

gtatacttga aatttgctaa gagagtagat ccgaagtgtt cacactacac aaaaaaggca 8040

actatgaggt gatggattta ttaacagctt gattgtggtg atccttttac aaagtataca 8100

tatattaaaa catcacattg tataccttaa atatatacaa tttttatttg tcagttgtaa 8160 

ctcaaaaaag ctagaaaagc atttttaaaa aggatgatgt actggtctta atattaccat 8220

tgagataagc tttataataa cataaaaaga aataacagta atgataatag caacaacaac 8280

aacaacaaag aactaacatt taagtagaat ttcttgtgca ctgtgcattc tgtttaagtt 8340 

atctcatttt accctcatga taacctgcag ggaagattct ttaaccccac atttcatagg 8400 

ctcagagagg ttaagtgcct tggttagagc cacatcagag ttaatccaca agagccagga 8460

ttcaagccca aatctgcctg gatctgtgct ctctaagata actgttagtg gtggcgtgtg 8520

tgttctcaca ctcagacatt tgatctgccc tttgtttccc attcttagct gcaaggcagt 8580 

gttaaagaac cctgtgtctc catatccact ccccacactt aagcactttt gtgggcccgt 8640 

gtgccgtatg cctcgtggca gcagggatcc aatgtcacag ttttaggcag tggcatcctt 8700 

ttccttgaaa acttgatgca ggggaacctt tctccatttc caaccacagg tgtgtctttc 8760 

agacactgag tgaggcaggt tttgtacttt attgtaacac aagaaccttt tcttctctgg 8820 

agtaaagcac tccagacatt cgcaagttgc tttacaagcc ttaaaaggat ggtattgtag 8880 

gcaactttaa ttaaatccca tctcctcctc tcccccagct tgcaagttga cccaaggaag 8940 

ccttcatttc catgacagac ttaattgtga gggcatcctc a 8981 
<210> 21

<211> 20284 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature
<222> (1)...(20284)
<223> n = a, t, c, or g 

<400> 21 

actgtgttag caaggatggt ctcgatctcc tgacctcgtg atccgcctgt atcggcctcc 60 

caaagtgctg ggattacagg cgtgaaccac tgcgccctgt tgagaatttt tttttttttt 120

tttgggagaa agagtttcgc tcttgttgcc cgggctagag tgcagtgaca caatctcggc 180

tcactgcaac ctctgcctcc tgggttcaag caattctcct gcctcagcct catgcgtcac 240 

cacgcccagc taattttgta tttttagtag agacagggtt tctccatgtt ggtcaggctg 300 

gtctcgaact cccaacctca ggtggttcgc ccgccttggc ctcccaaagt gctgggattg 360

caggcatgag ccactgcgcc cagccccaaa ttttggtttt tgcttgaaaa ctgaggtctg 420

aattcagcct tctggttgcc cctcaagagt cagtttaaat gttggtcatg ttagttgtca 480 

gtgaaaacaa tggtgaggct ggcatgagag tgtgaatctg gatgggaggg cttgtgcttc 540

atgaaaacat ttttccagat cagctcagtc gtgagttatc cgtcattgac gttataataa 600 

gctctgatta tttatcaagc atcattcttt atagatatct cagtttaatc tgagataatc 660 

ttctccacat ctctccacat agatgttatg aattttactt ttacagagga gccaactgag 720 

gctcagataa gttacttatt atatgactag tagtggtaga gctggggttt caactaagaa 780 

ctctctggct ccaaagccct tgtaagtttc tatcagtata tgaccatgca tatgagcatt 840

tgtctctcct cttcttcata gctccttact gcaatgattt gatgaagaat ttggagtcta 900 

gtcctctttc ccgcattatc tggaaagctc tgaagccgct gctcgttggg aagatcctgt 960 

atacacctga cactccagcc acaaggcagg tcatggctga ggtaagctgc ccccagccca 1020

agactccctc cccagaatct ccccagaact gggggcaaaa aactcaaggt agcttcagag 1080

gtgtgcgcta agtatactca cggctcttct ggaattccca gagtgaaaac ctcaagtctg 1140

atgcagacca gagctgggcc agctccccag tcgtgggtat agaatcatag ttacaagcag 1200

gcatttcttg gggatgggga ggactggcac agggctgctg tgatggggta tcttttcagg 1260

gaggagccaa acgctcattg tctgtgcttc tcctcctttt tctgcggtcc ctggctcccc 1320

acctgactcc aggtgaacaa gaccttccag gaactggctg tgttccatga tctggaaggc 1380

atgtgggagg aactcagccc caagatctgg accttcatgg agaacagcca agaaatggac 1440

cttgtccggg tgagtgtccc tcccattatt accatgtgcc tgcttgatac tggagaggtg 1500 

agtttctggt cactttccca ggtgtgagtg aggtgagaat tctttcagtt tatctagctg 1560 

99ggaatgta gtgagcatag ctaaagtcac agggcaccac ctctccagaa gtacaggcca 1620

tggtgcagag ataacgctgt gcatatcagc atccatgcca ctcacggtca aatagcagtt 1680

ttctgcaaaa cttagtgagg gctggtgttt ggaagtggag ttgagtaatt gcagtaccct 1740

attttccttt ttgctgcagc ctctcagcca gccacagcat ctccctgtgt cttggtaggt 1800 

tttggaaaga agtgtgggag caaaagcatg atgttacatg tagactggcc tgagatactc 1860 

attctcaggg cactgtgtga atgatgagct gctgttactg tgtggagggg aaatgcactt 1920 

agtgcttcag agccacttga aagggataag tgctctagag acaattgggt tcaaatgtgg 1980 

agcaggctga gcaagaacag aatgtctcct ttgcctgagc ctgagtgctg ttaatcacat 2040

cttcctgcct tgggctgagt tagagaatca ttagactatt tcctgtttcc atggtgaggg 2100

aggcctcttc cttttgtctc tgctcccctt aagaagcagg tgaggatttt gccaggtttc 2160 

ttgttttgaa ccttattgac tttaagggcg gctgggtttt agagactgta cctacctagg 2220 

gggaacactt ccgaagttta ggactattcc ctgatccgct gggaggcagg ttactgagga 2280

agtcccttta aaaacaaagg agtttatact gagaaaagca taaacagtga tttgtatgga 2340

ttcacactga ctaatatagc tcatgccatt aaagtggggt ctcttctcta aaggagggtt 2400 
atatgatcta gccccgtaga cctaagtgtg gtttcagacc tgttcttcct ggtcctctcc 2460

ttggaatcca tatttctact agttggactt tttctgtttg tctggctctc agaggattat 2520 

aggaggccct gtgaagtgac tcagtgaatt ttgatttgtg ggcaagtaga tggttcccta 2580 

gtctgaaatt gactttgcct taggtgcttc aattcttcat aagctcccag ttcttaaagg 2640 

acaagatcct tgtaaacatg gcaatggcat tcattaggaa tctagctggg aaaatccagt 2700 

gtgtatgctt ggaaatgagg gatctggggc tggagagaaa ggcatgggca tgccttggag 2760

ggacttgtgt gtcaagctga ggacctttac tttaagctct aggggaccag gcaaggggag 2820

atgtagatac gttactctga tggggtggat gaattgaaga aggatgaggc aagaatgaag 2880

gcagagacca gggaggaggc tctccaagtg gccaaggcat aaagcaagaa atgaggcctg 2940

gtgactgctt agtggcagag cagtgaaaga gagggaggca tcaaagtgag tctcgatttc 3000

tagctgggtg ggtggtagcg atgtccagta ggccagtggc tactgaggtc tgcagtggag 3060

gagggtggtt gggctggaga cagatgatga gggagtcatc agcctgtggg tggaagaaaa 3120

gggaacctct tccaactgtt ttctttgctt cttccctctc tttctctttt tttttttttt 3180 

tggacagagt cttgctctgt cacccaggct gaaatgcagt ggcatgatct tggctcacca 3240 

cagcctccgc ctcctgggtt caagcaattc tcctgtctca gcctccagag tagctgggat 3300

tacaggcaca tatcactgtg cccggctaat ttttgtattt tcagtggaga tgggatttca 3360 

ccatgttggt cgggctggaa tgaactcctg acctcaagtg atccacctgc ctcagcctcc 3420

caaagtgttg ggattacagg catgagccac cgcgcccggc ctttcttccc tctcttaaag 3480 

agtgtttatt taattccaca aacatgagct tgtcaccccc tgtagcctgg catctcctac 3540

acgaggtgat ggctgaggct tctgcttctg ctggggtagc tctgatcttt ctgctttctc 3600

tggcactgtc tacccatgtt gcctcacccc acaggtccca gggcacctct ctcgggcaag 3660 

tcttggaacc ctctgacact gatttgctct cttttctgag ctgcttttag ccacccatcc 3720

tcgggacctg ttttctctct gcctccaccc ctgcgggcag tcttaggtct cctgcccctc 3780 

acgagcaccc cagagaggcc acgtgctcag tgatctcagt gggcgcatct ttctagtctt 3840

gctattcttt ttggccatgt tgttcagaaa ccatactggg cagggccgac ttcaccctaa 3900

aggctgcgtc tcttcactct gcttttgttt gttccaaata aagtggcttc agaattgcta 3960

accctagcct ctgtgaactt gtgaggtaca attttgtgtc tgttatgtta acaaaaatac 4020 

atacatacct tcctggtgat ggtataaatt gctattctct attggaaagc aatttggaat 4080 

gaaaatttaa agaaccattt taaaatatgc tatcctgcgt acctccattc cacccacccc 4140 

cagggatgta gcctactgaa ataattttaa agaagtcacc atatgagaga aaatgttatt 4200

gctatattgt tattgtgaga aattggaaat agactaaatg ttcagcacta taggaataat 4260 

taatgaaatt acatatactc tatacaatca ttatgctgcc attgaaataa taaatacaaa 4320

ggcgcaaggg gggaaaagct tataatgtta gtgaaactaa gactgatttt tttataaagc 4380

agcagttttc agacccttgg agactccaat tcggtagaac cagagcttca tcttctctgt 4440 

cgaagctgtg acaggagttg caaatgcctc tcctttttgc tgagtttgca gctgctgttt 4500 

ttccggcagc acatctgtgc aggcctctgc ctcggcccct ctggatctgc tgattgagca 4560 

gcggattgat ctgtccttct ctttcgtgtt gacccatgtg aggaaccaac tggcaaggga 4620

acaagaaatg gaaataggcc tcctttgcat catgacctgt acatcctgca attggaaaag 4680

attgtacttt agttggttta accagcagca ttatttttct aaactaagca gtaagaagga 4740 

attaggtttt atgtgggatc aacagactgg gtctcaaaag aggaaggtga tagaacacag 4800 

tggggagggg gaggtgcact agaaacagag ggcctatgct ttcattctgg ctttgctact 4860 

taatagctgt gtgacccaat cttagagact taacctctct gaacttccat tttctcatgt 4920

ataaaatggg aaatattaaa ggatactcac tgggctggtg gcttgtgcct gtaatcccag 4980 

cacttgggga ggttgaggtg ggaggatcac ttgagcccag gtgttcaaga ccagcccagg 5040

caacatggca agactctgtc tctatgaaaa aattaaaaat tagccaggtg tggtggtgtg 5100

cacctgtagt cttagctact tggtaggctg agatgggagg atcacttggg cttgggaggt 5160 

caaggctgcg gtgagctgtg attccatcac tgcactccag cccgggcggc agagcgagac 5220 

actgaatcca aacgacaaca acaacaaaag gcaaaaaaat aaaagtgccc tctttatgga 5280

gttgtgtaag gtgaagcata tacactattc aacatagtaa ctatataaag gaagtattgt 5340

tgttgttact gtagttaata ccattaagtg agatgtttcg tatagtggaa agcacatgga 5400 

ctctgaattc agactggtct gactttgagt ctcagctcca catctagtaa tactatgacc 5460 
aagccctggt taaaatcatg tttttttttc ttcagcctca gtcttctcac atataaaata 5520

gggacactgt catttacctc agttttctgt gaggataaaa caacgacagt gtatatgcaa 5580 

gtattttgta aattttgtag tgctcctcaa gatttagttg gtgtttacta cttgtacttt 5640 

ctcactggaa tggcagatgc tgttggacag cagggacaat gaccactttt gggaacagca 5700

gttggatggc ttagattgga cagcccaaga catcgtggcg tttttggcca agcacccaga 5760 

ggatgtccag tccagtaatg gttctgtgta cacctggaga gaagctttca acgagactaa 5820

ccaggcaatc cggaccatat ctcgcttcat ggaggtgaat ctgttgctgg gatcatttag 5880 

aaaagactta acggcttctt tctctgagac gttacaataa ggttcaggca ggaggcaagt 5940

ttagaaataa tgtatagtct catttacaaa actatccctc aagcctaaca caggatttga 6000 

taacaaaagg cacttaataa atgttagttg agtggttgaa tgagtaaata aactctagct 6060 

ttagtaaatt aactctagct tattctatat aggctcaaga gaatatttct acccattttc 6120 

ttctaggttt tcctatctca gtgactaatg gtagcaaagc attcccttaa aaaggcatta 6180 

tttgtgaaac ttayctaaaa tcgaattcgg gtccaattaa atttttgaaa ttttatatta 6240 

aaaattatat tagtagggat gggtaagagg tgttttggtc tggttggttg gttagttgct 6300

atgactcaga attgctaaga aaacagaaaa gtaagataag atcattgttt taacctcttt 6360 

tcctccacaa aatcaataaa taacatatcc ctaaattact cttagaattt ctcttaaatt 6420

gcagtgaaaa accaaaatcc ttcattcttg gttgaaggtt ggaaaactac gttagagagg 6480

attagagaga gaggatgagc aatcgtgtag tcagcccttg cctcctagtg taggatttgt 6540

ctcagccact gcttgttgtc ctggctgcca acgttctcat gaaggctgtt cttctatcag 6600 

tgtgtcaacc tgaacaagct agaacccata gcaacagaag tctggctcat caacaagtcc 6660 

atggagctgc tggatgagag gaagttctgg gctggtattg tgttcactgg aattactccm 6720

rgcagcattg agctgcccca tcatgtcaag tacaagatcc gaatggacat tgacaatgtg 6780 

gagaggacaa ataaaatcaa ggatgggtaa gtggaatccc atcacaccag cctggtcttg 6840 

gggaggtcca gagcacctat tatattagga caagaggtac tttattttaa ctaaaaattt 6900 

ggtagaaatt tcaacaacaa caaaaaaact caacttggtg tcatgatttt ggtgaaattg 6960 

gtacatgact tgctggaagg tttttcatag gtcataaaat aacagtatct tttgatttag 7020 

catttctact caagggaatt aattccagga attttggtgg caggcacctg taatcccagc 7080 

tactcgggag gctgaggcag gagaattgct tgaacccagg aggcagaggt tgcagtgagc 7140

taagatcgca tcattgcact cccgcctggg caataagagt gaaactccat ctcaaaaaaa 7200 

aaaaagatac aaaaatagaa aaaggggctt ggtaagggta gtagggtttt gggcaatttt 7260

tttttttttt ttttttttta ttgtatggtt ctaaaggaat ggttgattac ctgtggtttg 7320 

gttttaggta ctgggaccct ggtcctcgag ctgacccctt tgaggacatg cggtacgtct 7380

gggggggctt cgcctacttg caggatgtgg tggagcaggc aatcatcagg gtgctgacgg 7440

gcaccgagaa gaaaactggt gtctatatgc aacagatgcc ctatccctgt tacgttgatg 7500 

acatgtaagt tacctgcaag ccactgtttt taaccagttt atactgtgcc agatgggggt 7560 

gtatatatgt gtgtgcatgt gcatgcatgt gtgaatgatc tggaaataag atgccagatg 7620

taagttgtca acagttgcag ccacatgaca gacatagata tatgtgcaca cactagtaaa 7680 

gctctttcct tctcatccat ggttgccact tttatctttt tatttttatt tttttttttg 7740

agatggagtc tcgctctgac gcccaggctg gagtgcagtg gctcgatctc ggctcactgc 7800

aacctttgcc tcccgggttc aagctattct cctgcctcag cctccacagt agctgggact 7860 

acaggctcat gctgccacgc ccggctgact ttttgtattt tagtagagac gaggtttcac 7920

catgttaccc aggctagact tcaactcctg agctcaggca atccaccctc cttggcctcc 7980 

caaagtgctg ggattacagg tgtgagccac tgcacccagc ccaccacttt aattttttac 8040 

actctaccct tttggtcaaa atttgctcaa tctgcaagct taaaatgtgt catgacaaac 8100 

acatgcaagc acatactcac acatagatgc agaaacagcg tctaaactta taaaagcaca 8160 

gtttatgtaa atgtgtgcac ttcttctccc taggtggtaa accacatttc aaaacaaccc 8220

aaataaaact gaacaaagct tcttcctctt agacttttta gaaaatcttt cagtgctgag 8280

tcactaagct gccaagttct cattgtggga actatgcctt tggatgtaat gatttcttct 8340 

aagacaatgg gcggaggtgt agttattgca gacatctgaa atatgtaatg tttcttccag 8400 

attctggaaa ttctcttatt ctctgtggtt ggtggtggtg gtgggatgtg tgtgtgtgtg 8460

tgtgtgtgtg tgtgtgtgtg tgtgtaggga tcaggatgcg ggaggagctg ggttctgctt 8520
gtattggttc tctgttttgc attgaatagt gtgtttcctt gtatggctat ctatagcttt 8580

tcaaggtcac cagaaattat cctgtttttc accttctaaa caattagctg gaatttttca 8640 

aaggaagact tttacaaaga cccctaagct aaggtttact ctagaaagga tgtcttaaga 8700

cagggcacag gagttcagag gcattaagag ctggtgcctg ttgtcatgta gtgagtatgt 8760

gcctacatgg taaagctttg acgtgaacct caagttcagg gtccaaaatc tgtgtgcctt 8820

tttactttgc acatctgcat tttctattct agcttggaat ctgaaacatt gacaagagct 8880 

gcctgaaatg tatgtctgtg gtgtgattag agttacgata agcaagtcaa tagtgagatg 8940 

accttggaga tgttgaactt ttgtgagaga atgagttgtt tttttgtttt ggtttttagt 9000

actttaacat aatctacctt tagtttaagt atcgctcaca gttacctagt tactgaagca 9060 

agcccccaaa gaaatttggt ttggcaacac tttgttagcc tcgtttttct ctctacattg 9120

cattgctcgt gaagcattgg atcatacgta catttcagag tctagagggc ctgtccttct 9180 

gtggcccaga tgtggtgctc cctctagcat gcaggctcag aggccttggc ccatcaccct 9240 

ggctcacgtg tgtctttctt tctccccttg tccttccttg gggcctccag ctttctgcgg 9300 

gtgatgagcc ggtcaatgcc cctcttcatg acgctggcct ggatttactc agtggctgtg 9360

atcatcaagg gcatcgtgta tgagaaggag gcacggctga aagagaccat gcggatcatg 9420

ggcctggaca acagcatcct ctggtttagc tggttcatta gtagcctcat tcctcttctt 9480

gtgagcgctg gcctgctagt ggtcatcctg aaggtaaggc agcctcactc gctcttccct 9540

gccaggaaac tccgaaatag ctcaacacgg gctaagggag gagaagaaga aaaaaaatcc 9600 

aagcctctgg tagagaaggg gtcatacctg tcatttcctg caatttcatc catttatagt 9660 

tggggaaagt gaggcccaga gaggggcagt gacttgccca aggtcaaccc agccgggtag 9720

cagctaagta ggatgagagt gcagggttca tgctttccag ataaccacat gctcaactgt 9780

gccatgctgt ctcattggta gtggttcatg gcagcatctg aaagctattt attttcttag 9840

atatattggg tggcgattct tcctaagttt ctaagaacaa taatcagaag gatatatatt 9900

gttgcaggtt agactgtctg gaagcagagg ctgaaataga gtttgatgta tgggtattta 9960 

tgagggctca atacctatgg aagagatatg gaagatgcag gattgggcag agggaggagt 10020

tgaactgtga tatagggcca accccgtggg gcactctaga gaatatgcag cttgttggag 10080

ttgttcttca tcgagctgaa acatccagcc ctttgtgctc ccccaaggcc tccctcctga 10140 

caccacctac ctcagccctc tcaatcaatc actggatgtg ggctgccctg ggaaggtcgt 10200

gccccagggc ctacatggct ctctgctgct gtgacaaacc cagagttgct gatgcctgag 10260

gccgtctact gacagctggg caacaaggct tccctgaatg gggactctgg gcagtgcagt 10320

tttgtgtctg aaccatacat taatatattt atatccgaat tttctttctc tgcaagcatt 10380

tcatataaag acacatcagg taaaaataaa tgtttttgaa gcaaaaggag tacaaagaga 10440

taagaactaa ctaatttaat actagttacc atctgttaca aatagttcct actgattgcc 10500

aaggactgtt taaacacatc acatgggctt cttcttctat cctcactaac ccttttaaca 10560 

gacaaggaaa tgaggctcag gaaggtcaag gactttattg aggttccaca gtaggataca 10620

gttcttgcta aaagcaaccc ctccctcatg ctctgttatc taactgcaag gggaaggtca 10680 

gtggcagagg tagtggtccc atggttggtg cataagagct gctctgagac aactgcatgc 10740 

tggtgggtcc tgcagacatg tacccatcag ccggagatag gctcaaaata tccacaagag 10800

tttggatgat tgtgggaatg cagaatccat ggtgatcaag agggaaagtc aagttgcctg 10860 

gccattttcc ttggctttta gacagaaaag ttacgtggga tattatctcc cacagctctt 10920 

ctgtggtgcc accagtcata gtccttatat aaggagaaac cagttgaaat tacctattga 10980 

agaaacaaag agcaaactcg cccactgaaa tgcgtagaaa gccctggact ctgttgtatt 11040

cataactctg ccattatttt tctgcgtagt tttgggtaag tcacttatct tctttaggat 11100 

ggtaatgatc agttgcctca tcagaaagat gaacagcatt acgcctctgc attgtctcta 11160 

acatgagtag gaataaaccc tgtctttttt ctgtagatca tacaagtgag tgcttgggat 11220

tgttgaggca gcacatttga tgtgtctctt ccttcccagt taggaaacct gctgccctac 11280

agtgatccca gcgtggtgtt tgtcttcctg tccgtgtttg ctgtggtgac aatcctgcag 11340

tgcttcctga ttagcacact cttctccaga gccaacctgg cagcagcctg tgggggcatc 11400

atctacttca cgctgtacct gccctacgtc ctgtgtgtgg catggcagga ctacgtgggc 11460

ttcacactca agatcttcgc tgtgagtacc tctggccttt cttcagtggc tgtaggcatt 11520

tgaccttcct ttggagtccc tgaataaaag cagcaagttg agaacagaag atgattgtct 11580
tttccaatgg gacatgaacc ttagctctag attctaagct ctttaagggt aagggcaagc 11640

attgtgtttt attaaattgt ttacctttag tcttctcagt gaatcctggt tgaattgaat 11700 

tgaatggaat ttttccgaga gccagactgc atcttgaact gggctgggga taaatggcat 11760 

tgaggaatgg cttcaggcaa cagatgccat ctctgccctt tatctcccag ctctgttggc 11820

tatgttaagc tcatgacaaa ccaaggccac aaatagaact gaaaactctt gatgtcagag 11880 

atgacctctc ttgtcttcct tgtgtccagt atggtgtttt gcttgagtaa tgttttctga 11940 

actaagcaca actgaggagc aggtgcctca tcccacaaat tcctgacttg gacacttcct 12000 

tccctcgtac agagcagggg gatatcttgg agagtgtgtg agcccctaca agtgcaagtt 12060

gtcagatgtc cccaggtcac ttatcaggaa agctaagagt gactcatagg atgctcctgt 12120

tgcctcagtc tgggcttcat aggcatcagc agccccaaac aggcacctct gatcctgagc 12180 

catccttggc tgagcaggga gcctcagaag actgtgggta tgcgcatgtg tgtgggggaa 12240

caggattgct gagccttggg gcatctttgg aaacataaag ttttaaaagt tttatgcttc 12300

actgtatatg catttctgaa atgtttgtat ataatgagtg gttacaaatg gaatcatttt 12360 

atatgttact tggtagccca ccactcccta aagggactct ataggtaaat actacttctg 12420

caccttatga ttgatccatt ttgcaaattc aaatttctcc aggtataatt tacactagaa 12480 

gagatagaaa aatgagactg accaggaaat ggataggtga ctttgcctgt ttctcacaga 12540 

gcctgctgtc tcctgtggct tttgggtttg gctgtgagta ctttgccctt tttgaggagc 12600 

agggcattgg agtgcagtgg gacaacctgt ttgagagtcc tgtggaggaa gatggcttca 12660

atctcaccac ttcggtctcc atgatgctgt ttgacacctt cctctatggg gtgatgacct 12720

ggtacattga ggctgtcttt ccaggtacac tgctttgggc atctgtttgg aaaatatgac 12780 

ttctagctga tgtcctttct ttgtgctaga atctctgcag tgcatgggct tccctgggaa 12840

gtggtttggg ctatagatct atagtaaaca gatagtccaa ggacaggcag ctgatgctga 12900

aagtacaatt gtcactactt gtacagcact tgtttcttga aaactgtgtg ccaggcagca 12960

tgcaaaatgt tttatacaca ttgcttcatt taattctcac aaggctactc tgaagtagtt 13020

actataataa ccagcaattt tcaaatgaga gaactgtgac tcaaagacgt taagtaacca 13080

gctttggtca cacaactgtt aaatgttggt acgtggaggt gaatccactt cggttacact 13140

gggtcaataa gcccaggcga atcctcccaa tgctcaccca attctgtatt tctgtgtcct 13200 

cagagggggt acaactagga gaggttctgt ttcctgagta caggttgtta ataattaaat 13260 

atactagctc taaggcctgc ctgtgattta attagcattc aataaaaatt catgttgaat 13320

ttttctttag tacttctttc ttaatataat acatcttctt gaccaagtcc aagaggaacc 13380

tgcgttggac agttttcata tgagatcaaa ttctgagaga gcaagattta accctttttg 13440

gttcaccttc tgatcctccc ctaaggaggt atacatgaaa tatttattac tcctgcctga 13500 

acttctttca ttgaatatgc aattttgcag catgcagatt ctggatttaa attctgagtc 13560 

ttaacttact ggctgaggga ccttggatag gctccttatc cctcagtttc ctcatctcta 13620 

aaatggggat ggcacctgcc ccgtgggttg ttggaaggac ttacagaggt gcagaatgta 13680 

cgttgtacat agcaggtttc agcaaatgtt agctccctct ttccccacat ccattcaaat 13740 

ctgttccttc tccaaaggat gtgtcaagga ggaaatggac ctggctggga aaccctcaga 13800

atactgggat gatgctgagc ttggctcata cctgtgcttt gctttcaggc cagtacggaa 13860 

ttcccaggcc ctggtatttt ccttgcacca agtcctactg gtttggcgag gaaagtgatg 13920

agaagagcca ccctggttcc aaccagaaga gaatgtcaga aagtaagtgc tgttgacctc 13980

ctgctctttc tttaacctag tgctgctgcc tctgctaact gttgggggca agcgatgtct 14040 

cctgcctttc taaaagactg tgaaaccact ccaggggcag agaaatcaca tgcagtgtcc 14100

ctttccaaat cctcccatgc catttatgtc caatgctgtt gacctattgg gagttcacgg 14160 

tctcgatccc tgagggacat tttctttgtt gtcttggctt ctagaagagt atcttttact 14220

tgccccctcc caaacacaca tttcatggtc tcctaacaag ctagaagaaa gaggtaaaga 14280 

caagcgtgat tgtggaacca tagcctcgct gcctgcctgt gacatggtga cctgtgtatc 14340 

agcctgtgtg ggctgagacc aagtggctac cacagagctc agcctatgct tcataatgta 14400 

atcattaccc agatccctaa tcctctcttg gctcttaact gcagacagag atgtccacag 14460 

ctcatcaaag gctctgcttc tgggttcttt gtgcttagag tggcttccta aatatttaat 14520 

aggtcccttt tctgccagtc tcttctgtgc ccatcccctg attgcccttg gtaaaagtat 14580 

gatgcccctt agtgtagcac gcttgcctgc tgttcctaat catcttctcc tacctcctct 14640 



ttacacctag ctcctgtttc agtcacctag aaatgctcac agtcgctgga atatgtcatg 14700 

ttcttccaca cctccatgcc tttgtaggta ctgtttgctc tcacaggaga actttctctc 14760 

taacttgcct atcttctcaa ctcctccttt ctctccaaga tctagttccg gatcccctcc 14820 

cctgagcatc cctccttggt tctcaggtag tcagtcactc tctgccctga acttccatgg 14880 

cacgtgaaag aaaatctttt tattttaaaa caattacaga ctcacaagaa gtaatacaaa 14940

ttacatgagg gggttccctt aaacctttca tccagtttcc ccaatggtag cagcatgtgt 15000

aactgtagaa tagtatcaaa accatgaaat tgacataggt acaattcaca aaccttcttc 15060

agatttcact agctttatgt gcgctcattt gtgtgtgtgt gtgcgtattt agttctatgc 15120

aattttatca tgtgtgaatt catgtaatta ctagctcagt caagctgcag aaatatctca 15180 

ttgtcacaaa gctccttcat gctacccctt aatggccaca gccacctccc ttcttcctca 15240 

gttcctgaca cctgtcaacc actaatgcgt tcctcgtttt tacagtttta ttatttctag 15300

aatgttacat aaatggaacc atacagtagg tatccttttg atactggctt tttttttttt 15360

ttcactcagc agtattccct tagatctatc caagttgtgt gtgtcaacag ttcattcctc 15420 

ttcactgctg agtagtgttc cctgggaggg gtgtatcaca gttccatggc atttttagat 15480

gtatttttta aacagctttc agcatcctct attttaattg ttcatcaagt cctttttccc 15540 

aatagactct gaatgctcct ttatcatcgt attcccatca ccaacatcag tacccaaata 15600 

ggccctaaat aaacatttat agcctcctgc ctgcctgaga aaccagggtg gacatggaga 15660 

gaaggcactt ctgaaagttc aagcgcagtg csctgtgtcc ttacactcca ctcctcagtg 15720

ctttctgtgg gttcatttct gtcttctctc ctgtcacagt ctgcatggag gaggaaccca 15780 

cccacttgaa gctgggcgtg tccattcaga acctggtaaa agtctaccga gatgggatga 15840

aggtggctgt cgatggcctg gcactgaatt tttatgaggg ccagatcacc tccttcctgg 15900

gccacaatgg agcggggaag acgaccacca tgtaagaaga gggtgtggtt cccgcagaat 15960 

cagccacagg agggttctgc agtagagtta gaaatttata ccttaggaaa ccatgctgat 16020

ccctgggcca agggaaggag cacatgagga gttgccgaat gtgaacatgt tatctaatca 16080 

tgagtgtctt tccacgtgct agtttgctag atgttatttc ttcagcctaa aacaagctgg 16140

ggcctcagat gacctttccc atgtagttca cagaattctg cagtggtctt ggaacctgca 16200

gccacgaaaa gatagattac atatgttgga gggagttggt aattcccagg aactctgtct 16260 

ctaagcagat gtgagaagca cctgtgagac gcaatcaagc tgggcagctg gcttgattgc 16320

cttccctgcg acctcaagga ccttacagtg ggtagtatca ggaggggtca ggggctgtaa 16380

agcaccagcg ttagcctcag tggcttccag cacgattcct caaccattct aaccattcca 16440 

aagggtatat ctttgggggg tgacattctt ttcctgtttt ctttttaatc tttttttaaa 16500 

acatagaatt aatatattat gagcttttca gaagattttt aaaaggcagt cagaaatcct 16560 

actacctaac acaaaaattg tttttatctt tgaataatat gttcttgttt gtccattttc 16620

catgcatgcg atgttaggca tacaaaatac attttttaaa gaatactttc attgcaaatt 16680 

ggaaacttcg tttaaaaaat gctcatacta aaattggcat ttctaaccca taggcccact 16740

tgtagttatt taccgaagca aaaggacagc tttgctttgt gtgggtctgg tagggttcat 16800

tagaaaggaa tgggggcggt gggagggttg gtgttctgtt ctctctgcag actgaatgga 16860

gcatctagag ttaagggtag gtcaaccctg acttctgtac ttctaaattt ttgtcctcag 16920 

gtcaatcctg accgggttgt tccccccgac ctcgggcacc gcctacatcc tgggaaaaga 16980

cattcgctct gagatgagca ccatccggca gaacctgggg gtctgtcccc agcataacgt 17040 

gctgtttgac atgtgagtac cagcagcacg ttaagaatag gccttttctg gatgtgtgtg 17100

tgtcatgcca tcatgggagg agtgggactt aagcatttta ctttgctgtg tttttgtttt 17160

ttcttttttt cttttttatt tttttgagat ggagtctcgc tctgtagcca ggctggactg 17220 

tagtggcgcg atctcggctc actgcaacct tggcctccca ggttcaagcg attctcctgc 17280

ctcagcctcc cgagtagctg ggactctagg cacacaccac catgcccagc taatttttgt 17340 

gtttttagta gagacggggt ttcaccatgt tggccaggat ggtctcaatg tcttgacctc 17400 

gtgatccgcc cacctcggtc tcccaaagtg ctgggaacac aggcatgagc cactgtgtct 17460

ggccacattt tactttcttt gaatatggca ggctcacctc cgtgaacacc ttgagaccta 17520 

gttgttcttt gattttagga gaagtgggag gtgaatggtt gagctgtaga ggtgacatca 17580

gcccagccag tggatggggg cttgggaaac attgcttccc attattgtca tgctggaggg 17640

ccctttagcc catcctctcc ccccgccacc ctccttattg aggcctggag cagacttccc 17700

agacctggta gtgcttcagg gccctggtat gatggaccta tatttgctgc ttaagacatt 17760

tgctcccact caggttgtcc catcagccat aaggccccca gggagcccgt gtgatggagc 17820

agagagagac ctgagctctg caatcttggg caaggctttt cccttatgtt tcttcttatc 17880

taaagtgaac agctggggct catgtgctcc ctcctcatct aaagtgaaca catggggctc 17940

atgtgcaggg tcctccccgc tttcagagcc tgaggtcccc tgaggctcag gaaggctgct 18000 

ccaggtgagt gccgagctga cttcttggtg gacgtgctgt ggggacagcc cattaaagac 18060 

cacatcttgg ggccctgaaa ttgaaagttg taactgcctg gtgcatggtg gccaggcctg 18120 

ctggaaacag gttggaagcg atctgtcacc tttcactttg atttcctgag cagctcatgt 18180 

ggttgctcac tgttgttcta ccttgaatct tgaagattat ttttcagaaa ttgataaagt 18240 

tattttaaaa agcacgggga gagaaaaata tgcccattct catctgttct gggccagggg 18300

acactgtatt ctggggtatc cagtagggcc cagagctgac ctgcctccct gtccccaggc 18360

tgactgtcga agaacacatc tggttctatg cccgcttgaa agggctctct gagaagcacg 18420 

tgaaggcgga gatggagcag atggccctgg atgttggttt gccatcaagc aagctgaaaa 18480

gcaaaacaag ccagctgtca ggtgcggccc agagctacct tccctatccc tctcccctcc 18540

tcctccggct acacacatgc ggaggaaaat cagcactgcc ccagggtccc aggctgggtg 18600

cggttggtaa cagaaacttg tccctggctg tgcccctagg tcctctgcct tcactcactg 18660 

tctggggctg gtcctggagt ttgtcttgct ctgttttttt gtaggtggaa tgcagagaaa 18720 

gctatctgtg gccttggcct ttgtcggggg atctaaggtt gtcattctgg atgaacccac 18780 

agctggtgtg gacccttact cccgcagggg aatatgggag ctgctgctga aataccgaca 18840

aggtgcctga tgtgtattta ttctgagtaa atggactgag agagagcggg gggcttttga 18900

gaagtgtggc tgtatctcat ggctaggctt ctgtgaagcc atgggatact cttctgttak 18960

cacagaagag ataaagggca ttgagactga gattcctgag aggagatgct gtgtctttat 19020 

tcatcttttt gtccccaaca tggtgcacta aatttatggt tagttgaaag ggtggatgct 19080 

taaatgaatg gaagcggaga ggggcaggaa gacgattggg ctctctggtt agagatctga 19140

tgtggtacag tatgaggagc acaggcaggc ttggagccaa ctctggcttg gccctgagac 19200

attgggaaag tcacaacttg cctcaccttc tttgccgata ataatagtgg tgcgttacct 19260

catagaggat taaattaaat gagaatgcac acaaaccacc tagcacaatg cctggcatat 19320 

agcaagttcc caaataaaat gcgtactgtt cttacctctg tgaggatgtg gtacctatat 19380

atacaaagct ttgccattct aggggtcata gccatacagg gtgaaaggtg gcttccaggt 19440

ctcttccagt gcttacccct gctaatatct ctctagtccc tgtcactgtg acaaatcaga 19500 

actgagaggc ctcacctgtc ccacatcctt gtgtttgtgc ctggcaggcc gcaccattat 19560 

tctctctaca caccacatgg atgaagcgga cgtcctgggg gacaggattg ccatcatctc 19620

ccatgggaag ctgtgctgtg tgggctcctc cctgtttctg aagaaccagc tgggaacagg 19680 

ctactacctg accttggtca agaaagatgt ggaatcctcc ctcagttcct gcagaaacag 19740

tagtagcact gtgtcatacc tgaaaaaggt gagctgcagt cttggagctg ggctggtgtt 19800 

gggtctgggc agccaggact tgctggctgt gaatgatttc tccatctcca ccccttttgc 19860 

catgttgaaa ccaccatctc cctgctctgt tgcccctttg aaatcatatc atacttaagg 19920 

catggaaagc taaggggccc tctgctccca ttgtgctagt tctgttgaat cccgttttcc 19980 

ttttcctatg aggcacanag agtgatggag aaggtcctta gaggacatta ttatgtcaaa 20040

gaaaagagac ttgtcaagag gtaagagcct tggctacaaa tgacctggtc gttcctgctc 20100

attacttttc aatctcattg accttaactt ttaaactata aaacagccaa tatttattag 20160 

gcactgattt catgccagag acactctggg cattgaaaga aagtaatgat aatagttaat 20220

tttatatagc gttgttacca tttcaacctt tttttttttt taacctctat catctcaatt 20280 

aaag 20284 



<210> 22 

<211> 7052 

<212> DNA 

<213> Homo sapiens 
<400> 22

gtgaacacac attaaagcat gagaagcatg aactagacat gtagccaggt aaaggccttg 60 

ctgagatggt tggcaaaggc ctcattgcag cattcattgg caggccacag ttcttttggc 120

agctctgctt cctgaccttt caccctcagg aagcgaggct gttcacacgg cacacacatg 180 

ccagacaggg tcctctgaag ccacggctgc cagtgcatgt gtcccaggga aagctttttc 240 

ctttagttct cacacaacag agcttcttgg aagccctccc cggcgaaggt gctggtggct 300

ctgccttgct ccgtccctga cccgttctca cctccttctt tgccatcagg aggacagtgt 360

ttctcagagc agttctgatg ctggcctggg cagcgaccat gagagtgaca cgctgaccat 420

cggtaaggac tctggggttt cttattcagg tggtgcctga gcttccccca gctgggcaga 480

gtggaggcag aggaggagag gtgcagaggc tggtggcgct gactcaaggt ttgctgctgg 540

gctggggctg ggtggctgcg ggggtgggag cagcttggtg gcgggttggc ctaatgcttg 600 

ctggggtgcc tggggctcgg tttgggagct agcagggcag tgtcccagag agctgagatg 660 

attggggttt ggggaatccc ttaggggagt ggacactgaa taccagggat gaggagctga 720

gggccaagcc aggagggtgg gatttgagct tagtacataa gaagagtgag agcccaggag 780

atgaggaaca gccttccaga tttttcttgg gtagcgtgtg taggaggcca gtgtcaccag 840

tagcatatgt ggaacagaag tcttgaccct tgctatctct gcctagtcct aatggctggc 900 

ttttcccagg aaggcttctg cttccatgga ctgttagatt aaccctttat ttaggtaaat 960 

gagggaacct actttataag cataggaaag ggtgaagaat cttttaagat tcctttactc 1020

aagttttctt ttgaagaatc ccagagctta ggcaatagac accagacttt gagcctcagt 1080

tatccattca cccatccacc cacccaccca cccatccttc catcctccca tcctcccatt 1140 

cacccatcca cccatccagc tgtccaccca ttctacactg agtacctata atgtgcctgg 1200 

ctttggtgat acaaaggtga ataagacata gtcctttcct ttgcccccaa ccctcagacc 1260 

agagatgaac atgtggaatg acctaaacac ctggaacagg tgtggtgtat gagcggcagg 1320

cctctgatga gagggtgggg gatggccagc cctcactccg aagcccctct gagttgattg 1380

agccatcttt gcattctggt cctgcagatg tctctgctat ctccaacctc atcaggaagc 1440 

atgtgtctga agcccggctg gtggaagaca tagggcatga gctgacctat gtgctgccat 1500

atgaagctgc taaggaggga gcctttgtgg aactctttca tgagattgat gaccggctct 1560

cagacctggg catttctagt tatggcatct cagagacgac cctggaagaa gtaagttaag 1620

tggctgactg tcggaatata tagcaaggcc aaatgtccta aggccagacc agtagcctgc 1680

attgggagca ggattatcat ggagttagtc attgagtttt taggtcatcg acatctgatt 1740

aatgttggcc ccagtgagcc atttaagatg gtagtgggag atagcaggaa agaagtgttt 1800

tcctctgtac cacagtacat gcctgagatt tgtgtgttga aaccagtggt acctaacaca 1860 

tttacatccc aaccttaaac tcctatgcac ttatttaccc tttaatgagc ctctttactt 1920

aagtacagtg kgaggaacag cggcatcagg atcacttggg aacttgttag aaattcagca 1980 

acttgggccc agctcagacc tactgaatca gaatcaggag caattctctg gtgtgactgt 2040

gtcacagcca ggtatcaact ggattctcat acataggaaa tgacaaacgt ttatggatgg 2100

atagtctact tgtgccaggt gctgagattt gttttttgtt ttttgatttt tttttaatca 2160 

ctgtgacctc atttaattct caaaaaaaga tgaaaaaatg aacactcagg aatgctgaca 2220 

tgagattcag aatcaggggt ttggggcttc aaagtccatc ctctctttat ccatgtaatg 2280 

cctcccctta gagatacaac atcacagacc ttgaaggctg aaggggatat aaaagctgtc 2340

tggccaagtg gtctccaagc ttgacagtgc agcagaatca cctggggata ttattaaaaa 2400

taaacatact aaggtttggc ttcagggcct gtgaatcaga atttctggag gtgaggcctt 2460

gaagtctgta tttctattgc atactttgga cacagtggtc tatagactag agtttggaaa 2520

tgattgcgct cattcagatt ctcttctgat gtttgaattg ctgccatcat atttctagtg 2580 

ctctatttcc tcctgctcat tctgtcttgg ataacttatc atagtactag cctactcaaa 2640 

gatttagagc cacagtcctg aaagaagcca cttgactcat tccctgtagg ttcagaataa 2700 

atttcttctg cgcagtgtct gtcatagctt tttttaaatt tttttttatt tttgatgaga 2760 

ctggagtttt gctcttattg cccaagctgg agtgcagtgg tgcgattttg gctcactgca 2820

acctccacct cccaggttca agcgattctc ctgcctcagc ctcccaagta gctgagatta 2880

caagcatgtg ctaccacgcc cagctaattt tgtattttta gtagagatgg gttttatcca 2940

tgttggtcag gctggtctcg agctccagac ctcaggtgat ctgcccgcct cggcctccca 3000



aagtgctggg attataggcc tgagccacag cgctcagcca taactttaat ttgaaaatga 3060

ttgtctagct tgatagctct caccactgag gaaatgttct ctggcaaaaa cggcttctct 3120 

cccaggtaac tctgagaaag tgttattaag aaatgtggct tctactttct ctgtcttacg 3180 

gggctaacat gccactcagt aatataataa tcgtggcagt ggtgactact ctcgtaatgt 3240 

tggtgcttat aatgttctca tctctctcat tttccagata ttcctcaagg tggccgaaga 3300 

gagtggggtg gatgctgaga cctcaggtaa ctgccttgag ggagaatggc acacttaaga 3360

tagtgccttc tgctggcttt ctcagtgcac gagtattgtt cctttccctt tgaattgttc 3420 

tattgcattc tcatttgtag agtgtaggtt tgttgcagat ggggaaggtt tgttttgttg 3480

taaataaaat aaagtatggg attctttcct tgtgccttca gatggtacct tgccagcaag 3540

acgaaacagg cgggccttcg gggacaagca gagctgtctt cgcccgttca ctgaagatga 3600

tgctgctgat ccaaatgatt ctgacataga cccaggtctg ttagggcaag atcaaacagt 3660

gtcctactgt ttgaatgtga aattctctct catgctctca cctgttttct ttggatggcc 3720 

tttagccaag gtgatagatc cctacagagt ccaaagagaa gtgaggaaat ggtaaaagcc 3780

acttgttctt tgcagcatcg tgcatgtgat caaacctgaa agagcctatc catatcactt 3840

cctttaaaga cataaagatg gtgcctcaat cctctgaacc catgtattta ttatcttttc 3900

tgcggggtcc tagtttcttg tatacattag gtgtttaatt gttgaacaaa tattcattcg 3960

agtagatgag tgattttgaa agagtcagaa aggggaattt gctgttagag ttaattgtac 4020

cctaagactt agatatttga ggctgggcat ggtggctcat gccagtaatc ccagcgcttt 4080 

gagaggctga ggtgggtaga tcacctgagg tcaggagttt gagaccagtc tgaccaacaa 4140

ggtgaaaccc cgtctctact aaatacaaaa aattagccga gtgtggtggc acatgcctgt 4200

catcccagct acttgggagg ctgaggcagg agaatcgctt gaacccagga ggcagaggtt 4260

gcagtcagcc acggttgcgc cattgcactc cagactgggc aacaagagtg aaaactccat 4320 

ctcaaaaaag aaaaaaaaag aattagatat tttggatgag tgtgtctttg tgtgtttaac 4380

tgagatggag aggagagcta agacatcaaa caaatattgt taagatgtaa aagcacatca 4440

gttaggtatc attagtttag gacaaggatt tctagaaaat ttttaggaac agaaaacttt 4500 

ccagttctct cacccctgct caaagagtgt atggctctta cattatatat aactgcctga 4560

cttcatacag tatcagtact tagatcattt gaaatgtgtc cacgttttac caaaatataa 4620 

tagggtgaga agctgagatg ctaattgcca ttgtgtattc tcaaatatgt caagctacgt 4680

acatggcctg tttcatagag tagtctataa gaaattgatg acttgattca tccgaatggc 4740 

tggctgtaac acctggttac gcatgaacac ctcttttcag ttgtctcaag acacctttct 4800 

tttctgtact tatcagacaa ggactgaaag gcagagactg ctactgttag acattttgag 4860 

tcaagctttt ccttggacat agctttgtca tgaaagccct ttacttctga gaaacttcta 4920

gcttcagaca catgccttca agatagttgt tgaagacacc agaagaagga gcatggcaat 4980 

gccgaaaaca cctaagataa taggtgacct tcagtgttgg cttcttgcag aatccagaga 5040

gacagacttg ctcagtggga tggatggcaa agggtcctac caggtgaaag gctggaaact 5100

tacacagcaa cagtttgtgg cccttttgtg gaagagactg ctaattgcca gacggagtcg 5160

gaaaggattt tttgctcagg tgagacgtgc tgttttcgcc agagactctg gcttcatggg 5220

tgggctgcag gctctgtgac cagtgaaggc aggatagcat cctggtcaag atatggatgc 5280

cggagccaga tttatctgta tttcaatccc agttctattc cttgccagtt gtgtatccgc 5340 

tggcaagtta cttctctatg cctcaatctc ctcatctgta aaatggggat aataatatta 5400 

cctgcaatac agggttgtta cgaaaataaa aatgaatagg tgcttagaat ggggcctgac 5460 

attagtaagt gcttagtttt gtgtgtgtat atgttatttt tattttggag gagaacataa 5520 

aaaggacaaa gtgtagaaaa actggttggg tgtattcagc tgtcataaca tgagagttgt 5580

tatgcccaga tgcacttgac atgtgaattt attagaaaca tgatttttct ctgagttgat 5640

gtttaactca aactgataga aaagataggt cagaatatag ttggccaaca gagaagactt 5700 

gttagactat tgtctgcatg tcagtgtttg catgctaact tgcttagtta gaaaggttaa 5760 

attttttcac tctataaaat caagaaatat agagaaaagg tctgcagaga gtctttcatt 5820

tgatgatgtg gatattgtta agagcgggag tttggagcat acagagctca agttgaatcc 5880 

tgactttgct acttattggc tatatgacct tgggcaagct gcttagtctc tctgatcctc 5940 

agttaccttt gtttgttgat gatgaccatt gataacacaa ccataaataa tgacaacata 6000 

gagatagttc tcattatagt agttgttata cagaattatt cactcaatgt taattttctg 6060

cattgaaatc ccagaacatt agaattgggg gcattatttg aatctttaag gttataagga 6120

atacatttct cagcaataaa tggaaggagt tttgggttaa cttataaagt atacccaagt 6180 

catttttttt cagagaagat atggtagaaa gtcttaggag gttgaagaag gaattggata 6240

tttattcttt ctgagactat catgggagat aatgactatg gttgtccatg attggagccg 6300

ttgctgtaga gttggtttta ttatagtgta ggatttgaat gggccatgtg ttctcagacc 6360

tcagaataaa aagagaaaac tgaggccagt ggggagcgtg acttcacatg ggtacacttg 6420

tgctagagac agaaccagga ttcaggactt ctggctcctg gtcctgggtt catggcccaa 6480

tgtagtcttt ctcagtcttc aggaggagga agggcaggac ccagtgttct gagtcaccct 6540

gaatgtgagc actatttact tcgtgaactt cttggcttag tgcctctgcc aggtggccat 6600

aacctctggc cttgtgttgc cagagaaaag gtttagtttt caggctccat tgcttcccag 6660 

ctgccaagaa tgccttggtg cagcacagtc ataggccctg cattcctcat tgccgtgctg 6720

gttggtcggg gaggtgggct ggactcgtag ggatttgccc cttggccttg tttctaacac 6780 

ttgccgtttc ctgctgtccc cctgccccct ccactgcctg ggtaaagatt gtcttgccag 6840 

ctgtgtttgt ctgcattgcc cttgtgttca gcctgatcgt gccacccttt ggcaagtacc 6900 

ccagcctgga acttcagccc tggatgtaca acgaacagta cacatttgtc aggtatgttt 6960

gtcttctaca tcccaggagg gggtaagatt cgagcagacc aaagatgttt acgagggcca 7020

aggg^^tgga cttcagaatt acacggtgga at 7052



<210> 23 
<211> 2534 
<212> DNA 

<213> Homo sapiens 
<400> 23 

gggaagcatt taaaaaaaaa aaagtatata tatatatata tatatatata tgtaatgtga 60 

attggcctct ttttctctaa gcccacattt tcttcttaca tagttcaggt ttactttatt 120 

ttttcctttc cggctgctga ccctgtattg cccgtagttg tggaacatag catgtgtttg 180

tgacctgtgc ctgttatttt tgtgctttct agttgtgcat gcaaagagta caaagttttc 240

ttgccctttc ttggaaaatc ctgcttgtct gtgccaaagg gataattgtg aaagcacttt 300

tgaaatactt aatgagttga ttttcttcaa attaaaaaaa atatataaat gtatatgtgt 360 

atgtacatgt gtgtacacat acacaccttt atacatacag cccatttaaa acaagctcca 420 

ctttggagtg ctctacgtca ccctgatgcc gaatacaggg ccagagtctg agatccttct 480

gggtggtttc tgtgttttgt tcatttctgt tttaagagcc tgtcacagag aaatgcttcc 540

taaaatgttt aatttataaa aacattttta tctctcgatt actggtttta atgaattact 600 

aagctggctg cctctcatgt acccacagca atgatgctcc tgaggacacg ggaaccctgg 660 

aactcttaaa cgccctcacc aaagaccctg gcttcgggac ccgctgtatg gaaggaaacc 720

caatcccgtg agtgccactt tagccataag cagggcttct tgtgcttgtt gcctggtttg 780

atttctaata tgctgcattt atcaactgca tgccacattg tgaccgccag catttgccct 840 

ttgaattatt attatgtttt atttacaaaa agcgaaggta gtaaccgaac taaattatct 900

aggaacaaac gtttggagag tcttctaaca ccgyscaaag cacgtcatta cagacatttg 960 

tttactgatt tagaacctta atatttaatt taaatacgca ctttacactt actgatgaaa 1020 

tgcttttcct ttctttctct cccagcccct gtacttaagt gcttcaatag gctctcatta 1080 

tatatgattt ttaggttttg cttatcagct tcttcgcttt tataatctga aaagatggca 1140 

tatgaatttt tataaaaagg gacactttct tcttctcaaa ttgtatattt ttattgtact 1200

ttccttcaaa accccctttt aaaaagtaag cagtggataa ataaattcag tgaagcatcc 1260

atatgaccct taagtgagtg taggggaagg gaggtcacca gatcactgtg agtgaagatg 1320

gtggagaggt gaggatctta tgaggccgtg ctcaaggctg gtagaggtgg gttagtgttt 1380

ccaggtttag gcagaatctc agctgaggtc atgaaacaac agtgatctct gaaaaattat 1440

ggcaaggtgg gaaggtgctg gagaattgga gagggggcaa acttgacttt caagtttcaa 1500

tgggaagata ggtgactctg cacaccacag aacagtgagc atgataacct gtttatacaa 1560

ggttctagag cagatttcta aatggatagc tactgtgtgc ttgtttgttc ttaattagta 1620 

ttggatagtt actaaatact tgttagtact tagtacataa tgggtggtaa atcctagcag 1680 

ctaatattgg ttcccaaata accagatgac aaggatagag aaggacacag acacggccta 1740 

tctggatttc atggtgcctt tgattttcca catgaaggtt gtgtagggaa gatagaagca 1800 

tgagatgaga tgataatata gttatctgga ttcatcactg gccagctgaa ccatatgaac 1860 

tcatggattg atgctagctt aggaaggctc tgtaggagcc agaactgggc tgagagccag 1920

cccatagaga caaaagaggc ccggccctga catcagaggg ttcaaacatg atgtctgagc 1980

cccacctaca gtctgccgga ggtggttgga aggaagagcc tttatcctta caattcttac 2040

tgaaattcaa atttttaggt tttgcaaaaa aatggtggac ctgaaggaaa tttgacagga 2100

gcatgtctca gctgtattta aatttgtctc agccaatccc cttttgaatg ttcagagtgt 2160 

aagcttcagg agggcagcgc gtcttagtgt gacttttctg gtcagttcag gtgctttaag 2220 

gagacaatta gagatcaatc tggaaaactt catttgaatt tttaatacat aagaaaacaa 2280

taagaaatag ttaaaaatat atatttatat aatatatata tgtgtgtgtg tgtgtgtgtg 2340

tgtgtgtgtg tatatatata tatattttat ttatttattt ttttttgaga tggagtctcg 2400

ctctgttgcc caggctggag tgcagtggct caatcttggc tcactgccac ctctgcctcc 2460 

caggttcaag tgattctcct acctcagcct cctgagtagc tgggattaca agcatgtgcc 2520

accacactgg ctaa 2534 

<210> 24 

<211> 2841 

<212> DNA 

<213> Homo sapiens 

<400> 24 

tcttgccagt ctctactcat ttttcagcac atcgagcata agatccagac tctttcccag 60 

gcctctctca tctggctcct ctcctcctcc tttatcatta ctcttcttcg tagcttatcc 120 

tactccagcc atgctgtctt cctattattc ctaaaaarta gaaatgcatt tcttcctagg 180 

gcctttgtac ctgcacttgc catcgctttt gctcagaatg ttctttttgc caagcttttg 240 

cccagcttgt tctccatcat tgttatgttt tggctgaaat gtcttctctt agtaggttca 300

ttctccccag tcactgtctt tttattttgc tttattttgg gccatctaag gttatcttat 360 

tagtgtattt gttgttcgtc tcctccatgg gcatacacct ccatgaaggc aggtattttc 420

accttaggcc ctcgaatata ctggacagca tctggcacgt agtagatgct caacgaatgt 480 

ttgttgtgtg agcaaatggt tggttgattg gattgaactg agttcagtat gtaaatattt 540

agggcctctt tgcattctat tttacttatg tataaaatga tacataatga tgatataaat 600

gatgtcacag tgtacaaggc tgttgtggga tcaagcaatc aaatgagatc atgcttgtct 660 

tttccaaatg gtgagggaat agatgcatgt ttgtggttgt tacggaatga tcctgtgctc 720

ctgaggcaac agaaaggcca ggccatctct ggtaatccta ctcttgctgt cttccctttg 780 

cagagacacg ccctgccagg caggggagga agagtggacc actgccccag ttccccagac 840

catcatggac ctcttccaga atgggaactg gacaatgcag aacccttcac ctgcatgcca 900

gtgtagcagc gacaaaatca agaagatgct gcctgtgtgt cccccagggg caggggggct 960

gcctcctcca caagtgagtc actttcaggg ggtgattggg cagaaggggt gcaggatggg 1020

ctggtagctt ccgcttggaa gcaggaatga gtgagatatc atgttgggag ggtctgtttc 1080

agtctttttt gttttttgtt tttttttctg aggcggagtc ttgctctgtc gcccaggctg 1140 

gagtgctgtg gcatgatctt gcctcactgc aacctccacc tcccaggttc aagcgattct 1200

cctgcctcag cctcctgagt agctgggatt acaggcacgc accaccatgt ctggctaatt 1260 

tttgtgtttt tagtagagat agggtttcgc cgtgttggct aggctggtct ggaattcctg 1320

acctcaggtg atccacccgc ctcggcctcc caaagtgctg ggattacagg cgtgagccac 1380

tacgcccagc cctgtttcag tctttaactc gcttcttgtc ataagaaaaa gcatgtgagt 1440 

tttgagggga gaaggtttgg accacactgt gcccatgcct gtcccacagc agtaaagtca 1500

caggacagac tgtggcaggc ctggcttcca atcttggctc tgcaacaaat gagctggtag 1560

cctttgacag gcctgggcct gtttcttcac ctctgaatta gggaggctgg accagaaaac 1620

tcctgtggat cttgtcaact ctggtattct tagagactct gtttgggaag gagtcctgag 1680

ccattttttt tttcttgaga atttcaggaa gaggagtgct tatgatagct ctctgctgct 1740

tttatcagca accaaattgc aggatgagga caagcaattc taaatgagta caggaactaa 1800

aagaaggctt ggttaccact cttgaaaata atagctagtc caggtgcggg gtggctcaca 1860

cctgtaatct cagtattttg ggatgccgag gtggactgat cacctaaggt caggagttcg 1920 

aaaccagctt ggccaatgtg gcgaaaccct gtctctacta aaaattcaaa aattagccag 1980 

gcatggtggc acatgcctgt aatcccagtt acttgggagg ctgaagcagg agaattgctt 2040 

gaacctggga ggtggaggtc gcagggagcc aaaattgcgc cactgtactc cagcctgagc 2100

aacacagcaa aactccatat caaaaaataa aatgaataaa ataacagcta atctagtcat 2160

cagtataact ccagtgaaca gaagatttat taggcatagt gaatgatggt gcttcctaaa 2220

aatctcttga ctacaaagaa tctcatttca atgtttattg tttagatgtt cagaataaat 2280

tcttgggaaa gaccttggct tggtgtaagt gaattaccag tgccgagggc agggtgaacc 2340

aagtctcagt gctggttgac tgagggcagt gtctgggacc tgtagtcagg tttccggtca 2400

cactgtggac atggtcactg ttgtccttga tttgttttct gtttcaattc ttgtctataa 2460 

agacccgtat gcttggtttt catgtgatga cagagaaaac aaaacactgc agatatcctt 2520

caggacctga caggaagaaa catttcggat tatctggtga agacgtatgt gcagatcata 2580

gccaaaaggt gactttttac taaacttggc ccctgcctta ttattactaa ttagaggaat 2640

taaagaccta caaataacag actgaaacag tgggggaaat gccagattat ggcctgattc 2700

tgtctattgg aagtttagga tattatccca aactagaaaa gatgacgaga gggactgtga 2760

acattcagtt gtcagcttca aggctgaggc agcctggtct agaatgaaaa tagaaatgga 2820

ttcaacgtca aattttgcca c 2841 

<210> 25 

<211> 852 

<212> DNA 

<213> Homo sapiens 

<400> 25 

gcatgctgga gtgatagtga ccatgagttt ctaagaaaga agcataattt ctccatatgt 60

catccacaat tgaaatatta ttgttaattg aaaaagcttc taggccaggc acggtggctc 120

atgcctgtaa tcccagcact ttaggagcca aggcgggtgg atcacttgag gtcaggagtt 180

tgagaccagc ctggccaaca tggggaaacc ctgtctctac taaaaataca aaataagctg 240

ggcgtggtgg tgcgtgcctg taatcccagc tacttgggag gctgaggcag gagaactgct 300

tgaatctggg aggcggaggt tgcagtgagc tgagttcatg ccattgcatt ccagcctggg 360

caacaagagc gaaaccatct cccaaaagaa aaaaaaaaga aagaaaaagc ttctagtttg 420

gttacatctt ggtctataag gtggtttgta aattggttta acccaaggcc tggttctcat 480 

ataagtaata gggtatttat gatggagaga aggctggaag aggcctgaac acaggcttct 540

tttctctagc acaaccctac aaggccagct gattctaggg ttatttctgt ccgttcctta 600 

tatcctcagg tggatattta ctccttttgc atcattagga ataggctcag tgctttcttt 660 

gaactgattt tttgtttctt tgtctctgca gcttaaagaa caagatctgg gtgaatgagt 720

ttaggtaagt tgctgtcttt ctggcacgtt tagctcaggg ggaggatggt gttgtaggtg 780 

tgcttggatt gaagaaagcc ttggggattg tttgtcactc acacacttgt gggtgccatc 840

tcactgtgag ga 852 

<210> 26 

<211> 6289 

<212> DNA 

<213> Homo sapiens 

<400> 26 
gctttataga gtttctgcct agagcatcat ggctcagtgc ccagcagccc ctccagaggc 60

ctctgaatat ttgatatact gatttccttg aggagaatca gaaatctcct gcaggtgtct 120

agggatttca agtaagtagt gttgtgaggg gaatacctac ttgtactttc cccccaaacc 180 

agattcccga ggcttcttaa ggactcaagg acaatttcta ggcatttagc acgggactaa 240

aaaggtctta gaggaaataa gaagcgccaa aaccatctct ttgcactgta tttcaaccca 300 

tttgtccttc tgggttttga aggaacaggt gggactgggg acagaagagt tcttgaagcc 360

agtttgtcca tcatggaaaa tgagataggt gatgtggcta cgtcaggggg cccgaaggct 420

ccttgttact gatttccgtc ttttctctct gccttttccc caagggccag gacccctgga 480 

tctctgggca gagcagacgc aggcccctat aatagccctc atgctagaaa ggagccggag 540

cctgtgtata aggccagcgc agcctactct ggacagtgca gggttcccac tctcccaact 600 

ccccatctgc ttgcctccag acccacattc acacacgagc cactgggttg gaggagcatc 660 

tgtgagatga aacaccattc tttcctcaat gtctcagcta tctaactgtg tgtgtaatca 720

ggccaggtcc tccctgctgg gcagaaacca tgggagttaa gagattgcca acatttatta 780 

gaggaagctg acgtgtaact tctgaggcaa aatttagccc tcctttgaac aggaatttga 840

ctcagtgaac cttgtacaca ctcgcactga gtctgctgct gatgatactg tgcaccccac 900 

tgtctgggtt ttaatgtcag gctgttcttt taggtatggc ggcttttccc tgggtgtcag 960 

taatactcaa gcacttcctc cgagtcaaga agttaatgat gccatcaaac aaatgaagaa 1020

acacctaaag ctggccaagg taaaatatct atcgtaagat gtatcagaaa aatgggcatg 1080 

tagctgctgg gatataggag tagttggcag gttaaacgga tcacctggca gctcattgtt 1140

ctgaatatgt tggcatacag agccgtcttt ggcatttagc gatttgagcc agacaaaact 1200 

gaattactta gttgtacgtt taaaagtgta ggtcaaaaac aaatccagag gccaggagct 1260

gtggctcatg cctgtaatcc tagcactttg ggaggctgaa gcgggtggat cacttgaggt 1320

caggagttcg agaccagcct ggcctacatg acaaaacccc gtatctacta aaaatacaaa 1380

aaaattagct gggcttggtg gcacacacct gtaatcccag ctacttggga ggctgaggca 1440

ggagaattgc ttgaaccctg taggaagagg ttgtagtgag ccaagatcgc accgttgcac 1500

tccagcctgg gcaacaagag caaaactcca tctcaaaaaa caaattaaat ccagagattt 1560 

aaaagctctc agaggctggg cgcggtggct tacacctgtt atcccagcat tttgggatgc 1620

cgaggcgggc aaagcacaag gtcaggagtt tgagaccagc ctggccaaca tagtgaaacc 1680

ctgtctctgc taaaaacata gaaaaattag ccgggcatgg tggcgtgcgc ctgtaatccc 1740

agctactcgg gaggctgagg tgagagaatt rcttgaaccc gggaggcgga ggttgcagtg 1800

agcccagatt gcaccactgc actccagcct gggcgacaga gcaagactcc atctcaaaaa 1860 

aagctctcag aacaaccagg tttacaaatt tggtcagttg gtaaataaac tgggtttcaa 1920

acatactttg ctgaaayaat cactgactaa ataggaaatg aatctttttt tttttttttt 1980 

taagctggca agctggtctg taggacctga taagtactca cttcatttct ctgtgtctca 2040

ggtttcccat ttttaggtga gaattaaggg gctctgataa aacagaccct aggattgtgg 2100

acagcagtga tagtcctaga gtccacaagt ctgcttttga gtgatgggcc catgtatctg 2160 

gcacatctgc aggcagagcg tggttctggc tcttcagatg atgccggtgg agcactttga 2220 

ggagtcctca ccccaccgtg ataaccagac attaaaatct tggggctttg catcccagga 2280

tttctctgtg attccttcta gacttgtggc atcatggcag catcactgct gtagatttct 2340 

agtcacttgg ttctcaggag ccgtttattt aatggcttca catttaattt cagtgaacaa 2400

ggtagtggca ttgctcttca cagggccgtc ctgttgtcca caggttccag attgactgtt 2460 

gccccttatc tatgtgaaca gtcacaactg aggcaggttt ctgttgttta caggacagtt 2520

ctgcagatcg atttctcaac agcttgggaa gatttatgac aggactggac accagaaata 2580 

atgtcaaggt aaaccgctgt ctttgttcta gtagcttttt gatgaacaat aatccttatg 2640

tttcctggag tactttcaac tcatggtaaa gttggcaggg gcattcacaa cagaaaagag 2700 

caaactatta actttaccag tgaggcagta cggtgtagtg tagtgattca gagaatttgc 2760

tttgccacca gacataccag gtaaccttga ctaagttact taacctatct aaacctcagt 2820 

tycctcatct gtgaaatgga gacagtaatc atagctattt ccaaactgtt gtgagaattc 2880 

aatgagttaa aggtataagg tcctcaccac agcgcctgcc cacatagtca gtgatcacta 2940

tgtcctgaac actgtaatta cttcgccata ttctctgatc atagtgtttt gccttggtat 3000

gtgactagaa tttctttctg aggtttatgg gcatggttgg tgggtatgca cctgcctgca 3060

ggagcccggt ttgggggcat taccttgtac ctggtatgtt ttctttcagg tgtggttcaa 3120

taacaagggc tggcatgcaa tcagctcttt cctgaatgtc atcaacaatg ccattctccg 3180 

ggccaacctg caaaagggag agaaccctag ccattatgga attactgctt tcaatcatcc 3240 

cctgaatctc accaagcagc agctctcaga ggtggctctg taagtgtggc tgtgtctgta 3300

tagatggagt ggggcaaggg agagggttat ggagaagggg agaaaaatgt gaatctcatt 3360

gtaggggaac agctgcagag accgttatat tatgataaat ctggattgat ccaggctctg 3420

ggcagaagtg ataagtttac gaattggctg gttgggcttc ttgaactgca gaagagaaaa 3480

tgacactgat atgtaaaaat cgtaacattt agtgaattca tataaagtga gttcaaaaat 3540

tgttaattaa attataattt aattataagt gtttaatcag tttgatttgt ttaaaaacca 3600 

ctgttttaaa tttggtggaa tatgttttta ttagcttgta tctttaattc ctaaattaag 3660

ctgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gaagtttaaa 3720

gccaggatga gctagtttaa agtatgcagc ctttggagtc atacagatct gggtttgaat 3780 

ctggtctcta aactttatag atgtatgata ttaaatgagg cagttcatgt aaattgccaa 3840

gcccagcact cagcacagag ttgatatttc acacacatta gatacctttc ctgtatgtgg 3900

agcatggcag ttcctgtttc tgctttactc ctacaggata ctaatatagg acactaggat 3960

ctttatacca agaccccatg taatgggctt atgagaccat tcttcttata aaaatctgac 4020

agaatttttg tatgtgttag atcaataggc tgcatactgt tattttcaag ttgatttaca 4080 

gccagaaata ttaatttatt tgagtagtta cagagtaata tttctgctct catttagttt 4140 

tcaagcccca ctagtccttt gtgtgtgaaa atttacaact tactgctctt acaaggtcat 4200 

gaacagtgga ccaaagtgaa tgccattaac cactctgact tccttcatta gttttattgt 4260 

gacagtggac tcttttgacc tcagtaatac cagtttggca tttacattgt catattttta 4320 

gacttaaaaa tgatcatctt aaccctgaat aaaatgtgtc tggtgaacag atgtttttcc 4380

ttggctgtgc ctcagatatc tctgtgtgtg tgtacgtgtg tgtttgtctg tgtgtccatg 4440

tcctcactga ttgagcccta actgcatcaa agacccctca gattttcaca cgctttttct 4500 

ctccaggatg accacatcag tggatgtcct tgtgtccatc tgtgtcatct ttgcaatgtc 4560 

cttcgtccca gccagctttg tcgtattcct gatccaggag cgggtcagca aagcaaaaca 4620

cctgcagttc atcagtggag tgaagcctgt catctactgg ctctctaatt ttgtctggga 4680

tatggtaagg acacaggcct gctgtatctt tctgatgtct gtcagggcca tggattgata 4740

tggataagaa agaaagagct ctggctatca tcaggaaatg ttccagctac tctaaagatg 4800 

tatgaaaaag aaatagccag aggcaggtga tcactttcat gacaccaaac acagcattgg 4860 

gtaccagagt tcatgtcaca ccagagggaa aattctgtac acaatgatga aaattaatac 4920 

cactaccact taagttccta tgtgacaact ttcccaagaa tcagagagat acaagtcaaa 4980 

actccaagtc aatgcctcta acttctctga tgggttttaa cctccagagt cagaatgttc 5040

tttgccttac taggaaagcc atctgtcatt tagaaaactc tgtacatttt atcagcagct 5100 

tatccatcca ttgcaaatat tgtttttgtg ccasccacaa tatattgctt ctatttggac 5160 

caatatgggg gatttgaagg aattctgaag ttctaattat atttcaactc tactttacaa 5220

tatctccctg aaatatatct ccctgtaact tctattaatt ataagctaca cagagcaaat 5280

ctaattcttc tcccaccgaa caagtccctg gatatttaaa aataactctc atactctcat 5340 

ttaacctgag tattacccag ataagatgat atatgagaat acaccttgta acctccgaag 5400

cactgtacaa atgtgagcaa tgatggtgga gatgatgatg agatctttgc tgtttatacc 5460 

aagcccctta gactgtgtca ctcttctgat ccggttgtcc ttgtatggcc atgctgtata 5520

ttgtgaatgt cccgttttca aaagcaaagc caagaattaa ccttgtgttc aggctgtggt 5580

ctgaatggtt atgggtccag agggagttga tctttagctc acacttctat tactgcagca 5640

caaagatttt gcattttgga aggagcaccg tcttactggc aacttagtgg taaaccaaaa 5700 

cctccatttc acacaaatga ttgtgaaatt cgggtctcct tcattctata caaattcatt 5760 

tgattttttt gaaactaaac tttatattta tccatattaa attacatggg ttttattttt 5820 

gttttatctt gattcagtaa ttactccttt cagtaaacac agactgagtg ctgtgtgtct 5880 

gacttatgcc aggcataggt gattcagaga tgaaaggtca agtccctgaa cccatctctt 5940

gtcttcctgg gtattatctg tccctccctg ctttagagct cctgaaattt gctagaagca 6000 

tgtcttcatc taagttgttg ataaacacat caagtaggat tggactgagg cagagccctg 6060 

tagtctgaag ctgcagttct tctagcggct gacaagcccc actatcactt ccctgctggt 6120

gctttgctct gccagctgtg aattctcata attgtcctat cgtcaagtct ttatttctgc 6180

attttactgc ttgatacact gtcaggacag actttaaaat tattctcagt gcgatgaaac 6240

aattctgaca ttcatgttat gagcagttac ctcataaata gattacatg 6289



<210> 27 

<211> 4244 

<212> DNA 

<213> Homo sapiens 

<400> 27 

aaattactct gactgggaat ccatcgttca gtaagtttac tgagtgtgac accttggctt 60 

gactgttgga aagacagaaa gggcatgtag tttataaaat cagccaaggg gaaaatgctt 120

gtcaaaatgt attgtcgggt attttgatta atagtttatg tggcttcatt aattcagagt 180 

tactctccaa tatgtttatc tgccctttct tgtctgataa tggtgaaaac ttgtgtgatg 240 

cattgtatat ttgatttagg ggtgaactgg atgtctttgt tttcactttt agtgcaatta 300 

cgttgtccct gccacactgg tcattatcat cttcatctgc ttccagcaga agtcctatgt 360 

gtcctccacc aatctgcctg tgctagccct tctacttttg ctgtatgggt aagtcacctc 420

tgagtgaggg agctgcacag tggataaggc atttggtgcc cagtgtcaga aggagggcag 480 

ggactctcag tagacactta tctttttgtg tctcaacagg tggtcaatca cacctctcat 540

gtacccagcc tcctttgtgt tcaagatccc cagcacagcc tatgtggtgc tcaccagcgt 600

gaacctcttc attggcatta atggcagcgt ggccaccttt gtgctggagc tgttcaccga 660 

caatgtgagt catgcagaga gaacactcct gctgggatga gcatctctgg gagccagagg 720

acagtgttta attgtgatct tattccactt gtcagtggta ttgacactgc tgactgcctt 780

gtcctgtctt cagagtctgt cttccctgag aaggcaaagc acctttcttt cttgctgtgc 840

cttacatttt gctggtcaag cctttcagtt tcttttgaca gtttttttta cttctttctt 900 

ttttcaatgt tgctcttacc aagagtagct cctctgcctt ccactttaca catgagagct 960 

gggcgacgca ttcagtccta aggcttttac catcacctct cttggtgttt ttattgtcat 1020

ctctaagatc aatgccttta gccttgatca taaccttgaa ctctaatctc aaattctcac 1080 

ttgcctagtg gattgctcca tttagatagt atatagatac cccaacctgg atatgtccta 1140

gttttctttc cccttggaac ttaatgcttt tcttgccatc cctgtcacac tcagtggcac 1200

taccatccac tcggttgccc aagctggctc ttagagttat cctagatgct tgctttgctg 1260

ttgcagattt cccacattca actggttatg ttgtcagttc ttccaggtat ggacctctaa 1320 

aataaggctt cctctccatt ccggttgtca ttgcctttgt ccaaacacag cacacaaggc 1380

cttttacagt tgcacaactc ttcctgtcca tacccaccac accctttccc agctgtaagc 1440 

ttcagatgag ttgcctccaa ccaccatgct cctgtaggcc tggcttgaaa tgcccttctt 1500

ctgtcacagg gtctggtagt atatcccttg cccttcaaga tttagctaaa atgtgaagct 1560 

ttccttacct gctgggaggt gttctctctt ttctctgtgc tctcagagtc cttagtccat 1620

gcctccagta caacgtacat ccacttacat ggtaatttcc tgtttacata cttttcctac 1680 

tcggagtgga gtctgtttct taataatttt gcctctccca tgccctagca cagtgcatcc 1740

agcgtatagc cccttattca gttggtagat atttggccac tgttgccttg tgggatcata 1800

agttctgatg tatttgagaa gaatttctaa aattctgaca aaatcctgaa actcaaatat 1860 

tgacccagac atgagcaatt tgcttttcaa atgctaaggg atttttaatg gatttgcttt 1920 

aattaaatct agcctgtttc taagctttat tcattatttc tccatactca gagcatttct 1980 

ccagattttc taaagaatag aattttattg ctacatatca tcagctatgc ctgctgctat 2040 

ttaattggta tctgaattaa aaggtctggt ttgtccctag agaatcaaat tttttcttca 2100

ctcccatatt tcagaacttg atacattttt aggataaacc atgaatgaca cccgtttctt 2160 

ctccctcacc ctcccttccc tcccattttt tttttttttt ttttttagaa gctgaataat 2220 

atcaatgata tcctgaagtc cgtgttcttg atcttcccac atttttgcct gggacgaggg 2280 

ctcatcgaca tggtgaaaaa ccaggcaatg gctgatgccc tggaaaggtt tggtgagtga 2340

agcagtggct gtaggatgct ttaatggaga tggcactctg cataggcctt ggtaccctga 2400

actttgtttt ggaaagaagc aggtgactaa gcacaggatg ttcccccacc cccatgccca 2460 

gtgacagggc tcatgccaac acagctggtt gtggcatggg ttttgtgaca caaccatttg 2520

tctgtgtctc tgatagcatt gagaaaagtg aaagggcagt tttgaaggta aggaaaatag 2580

tgttatttgc ttggatccac tggctcatgc cactgtctgg gttggttaga agcactggaa 2640

aagtcaaacc ataactttga gaattaggtg atcagggaat cagaaggaaa gatgcaaact 2700

ttggctcttt taggcgaatc atgtgcctgc agatgaggtc atttattatc ttttacacag 2760 

tctataaaat tataatgtat tacatctttt tctaccttta gaatggttaa aaatatttct 2820 

ccggtagcca tatgattatt attcatccat tagataatat agtcaaatgg gccatgttat 2880 

ttactgttca tagaagaggg gctttttgca acttgggcta caaaggagat atgtaaggaa 2940

tttaaggaat ggttacatgg aactagattt aattgaatct agtggtttaa ttgattcact 3000 

aggatatatg ctactgaaag gggaatctgc ttaaagtgct ttctgatatt tattattact 3060 

aaaacttaga atttattaaa aatactgact gtgaaaatta cttgggtcgt ttgccttttt 3120

aaaaggattt ttggcatgtc tcattaaaaa aagaaatact agatatcttc agtgaagtta 3180 

caaatcgaat acacattggc tctgaaattc tgattgatac tgggtcataa aaagttttcc 3240 

caaatcagac ttggaaagtg atcactctct tgttactctt ttttccttgt catgggtgat 3300 

agccatttgt gtttattgga agatcggtga attttaagga acataggccc aaatttgagg 3360 

aagggccatg gtttttgatc cctccattct gaccggatct ctgcattgtg tctactaggg 3420 

gagaatcgct ttgtgtcacc attatcttgg gacttggtgg gacgaaacct cttcgccatg 3480

gccgtggaag gggtggtgtt cttcctcatt actgttctga tccagtacag attcttcatc 3540

aggcccaggt gagctttttc ttagaacccg tggagcacct ggttgagggt cacagaggag 3600

gcgcacaggg aaacactcac caatgggggt tgcattgaac tgaactcaaa atatgtgata 3660

aaactgattt tcctgatgtg ggcatcccgc agccccctcc ctgcccatcc tggagactgt 3720 

ggcaagtagg ttttataata ctacgttaga gactgaatct ttgtcctgaa aaatagtttg 3780 

aaaggttcat ttttcttgtt ttttccccca agacctgtaa atgcaaagct atctcctctg 3840 

aatgatgaag atgaagatgt gaggcgggaa agacagagaa ttcttgatgg tggaggccag 3900

aatgacatct tagaaatcaa ggagttgacg aaggtgagag agtacaggtt acaatagctc 3960

atcttcagtt tttttcagct ttatgtgctg taacccagca gtttgctgac ttgcttaata 4020

aaagggcatg tgttcccaaa atgtacatct ataccaaggt tctgtcaatt ttattttaaa 4080 

aacaccatgg agacttctta aagaattctt actgagaatt cttttgtgat atgaattccc 4140

attctcgaat actttggttt tatatgctta catttatgtg ttagttatta aaacatacta 4200 

atattgtata tctagtcaaa ctgagtagag agataatggt gatt 4244 

<210> 28 

<211> 5023 

<212> DNA 

<213> Homo sapiens 

<400> 28 

ttttaaaata cctgcaatac atatatatgt tgaatagatg aaaaattatg tagatgataa 60 

tgaatgatac ggttctaaaa agacaggtta aaaagtaagt tcacttttat tttgagcttc 120

agaatcattc agaagccagt cgccacaaac gcagaccaag gctcttggca catcaaatat 180

gcctatggct tagggttatt gacaagtctt atgttgcagt gtatgtggtt tatagtcctg 240 

ccttccacag ttgcttggga gagctgtgag tcactgaggc ttatgaatgt ttacattttg 300 

tttgttgcag atatatagaa ggaagcggaa gcctgctgtt gacaggattt gcgtgggcat 360

tcctcctggt gaggtaaaga cactttgtct atattgcgtt tgtccctatt agttcagact 420 

atctctaccc aatcaagcaa cgatgctcgt taagaggtaa aagtggattt taaaggcttc 480 

tgtatttatg ccaggatgga gcaattagtc atcgagaaga gagggaccct gtatgtcaag 540

agaatgattt cagagaatcc aatacaattt aagaaaaagc atggggctgg gcgcagtgat 600

tcactcctgt aatcccagca ctttgggagg ccgaggtggg cggactcacg aggtcaggag 660 

attgagacca tcctggccaa catggtgaaa ccccatctct actataaata caaaaattag 720

ctgggcatag tagtgcattc ctgtagtccc agctactcgg gaggctgagg caggagaatt 780

gcttgaacct aggaggggga ggttgcccag attgcgctgc tgcactccag cctggtgaca 840

gagtgagact catgtcaaca acaaaaacag aaaaagcacg cacatctaaa acatgctttt 900 

gtgatccatt tgggatggtg atgacattca aatagttttt taaaaataga ttttctcctt 960

tctggtttcc gtttgtgttc ttttatgccc ttttgccaga gtaggtggtg caatttggct 1020

agctggcttt cattactgtt tttcacacat taactttggc ctcaacttga caactcaaat 1080 

aatatttata aatacagcca cacttaaaat ggtcccatta tgaaatacat atttaaatat 1140

ctatacgatg tgttaaaacc aagaaaatat ttgattcttc tctgatattt aagaattgaa 1200

ggtttgaggt agttacgtgt taggggcatt tatattcatg tttttagagt ttgcttatac 1260

aacttaatct ttccttttca gtgctttggg ctcctgggag ttaatggggc tggaaaatca 1320

tcaactttca agatgttaac aggagatacc actgttacca gaggagatgc tttccttaac 1380

aaaaataggt gagaaaagaa gtggcttgta ttttgctgca aagactttgt ttttaattta 1440

tttaaagaaa taggttgtta tttttgatta cagtggtatt tttagagttc ataaaaatgt 1500

tgaaatatag taaagggtaa agaagcacat aaaatcatcc atgatttcaa tatctagaga 1560

taatcacaat ttacatttcc tttcagtctc attctcttct tttaacagct ttattcaggt 1620

ataatttaca tacaatataa tttgcttgtt ttttaagagt ataatttagt gatttttggt 1680

aaattgagag ttttgcaacc atcaccacaa tccagtttta gaacttttcc atcaccccac 1740

atctgtctta tatacacata taaatgtgcc atacaattga gatcatactg tatgtagaat 1800 

ttaaaattag tttttattgt taatgagtgt attatgaata tttcccagtg ggttacattt 1860 

cctaagatgt ggaattttac attgctacat aaaatccccc tatgtacatg tacctataat 1920 

ttatttaata aattccttat aaatgttgga cacattagtt tccatttttc actatgtaaa 1980 

tatgtccctg tatacatctt ttattatttc ctcaggaaca attcctacaa agtaaattgc 2040

cctctctaaa gagcatacaa attgactgag ccaccgttag gccattttct gagactgcac 2100

aggtcacaaa gcaatctgat ctttgggaat acagctacat tttataggct tcttagataa 2160 

tgttactcta agtactttaa atatgtgggg cttctctggg cttttttttt tttgagacgg 2220 

agtttcactc ttactgccca ggctggagag caatggcgcg accttggctc actgcaacct 2280

ccgcctccca ggttcaagcg attctcctgc ctcagcctcc tgagtagctg agattacagg 2340

tgcccgccac aatgcctgcc taattttttt gtattttcag tagagatggg gtttcaccat 2400

gttggccaga ctggtctcga gctcctgacc tcaggtgatc cacctgcctc agcctcccaa 2460 

agttctggga ttacaggcat gagccactgc gcccggcttc tctggactta ttatgtggag 2520

agatagtaca aggcagtggc tttcagagtt ttttgaccat gaccgttgtg ggaaatacat 2580 

tttatatctc aacctagtat gtacacacag acatgtagac acatgtataa cctaaagttt 2640

cataaagcag tacctactgt tactaattgt agtgcactct gctatttctt attctacctt 2700 

atactgcgtc attaaaaaag tgctggtcat gacccactaa atttatttcc caaaccacta 2760

atgaacaatg actcacaatt tgaacacact ggacaggggg atagccaata aaattgaaaa 2820

gagcaaggaa attaatgtat tcatgatctc ctctcctgtc tcttacattt ttgcagtagc 2880 

aatgtaaagg aatcctaaga gaacagacat tctgggaata gcaggcctag cgctgcacaa 2940

ctgctttcct aggcttgctc ctagtaccaa gctcctgacg catatagcag tggcagtaat 3000

aaccagccca tagtaaggtt tgtcacaggg actggttgta agaactgatt tgrttggtat 3060

agctgtgagg gcctggcacg gtgtccacgt gtgcctcaat cctaattctg aaaaaggctg 3120

accctggggg tgctaattag atacacagag aggaatgaat gctgccagaa ggccaagttc 3180 

atggcaatgc cgctgtggct gaggtgcagt catcagtctg gaacgtgaac actgaacttc 3240

tctcacatgt gattcttcac ttgactggct tcatagaacc ccaaagccac cccaccacca 3300

cataaattgt gtctctaggt tctgtgttgc tcacactcaa aatttctggg ccttctcatt 3360 

tggtgcatgt gaatggtgca tatgagtgaa gtctaggatg gggccttagc gttaaagccc 3420 

tggggtagtg tgactgagat tgttggtaaa gaatgtgcag tggttggcat gacctcagaa 3480

attctgaaat gggactgcac ctgcagactg aagtgttcag agagccaggg aggtgcaagg 3540 

actggggagg gtagaggcag gaaccctgcc tgccaggaag agctagcatc ctgggggcag 3600

aaaggctgtg ctttcaagta gcagcagatg tattggtatc tttgtaatgg agaagcatac 3660 

tttacaggaa cattaggcca gattgtctaa ccagagtatc tctacctgct taaaatctaa 3720

gtagttttct tgtcctttgc agtatcttat caaacatcca tgaagtacat cagaacatgg 3780

gctactgccc tcagtttgat gccatcacag agctgttgac tgggagagaa cacgtggagt 3840

tctttgccct tttgagagga gtcccagaga aagaagttgg caaggtactg tgggcacctg 3900

aaagccagcc tgtctccttt ggcatcctga caatatatac cttatggctt ttccacacgc 3960 

attgacttca ggctgttttt cctcatgaat gcagcagcac aaaatgctgg ttctttgtat 4020 

ctgctttcag ggtggaaacc tgtaacggtg gtggggcagg gctgggtggg cagagaggga 4080

gtgctgctcc caccacacga gtcccttctc cctgctttgg ctcctcacca gttgtcaggt 4140 

tatgattata gaatctagtc ctactcagtg aaagaacttt catacatgta tgtgtaggac 4200

agcatgataa aattcccaag ccagaccaaa gtcaaggtgc tttttatcac tgtaggttgg 4260

tgagtgggcg attcggaaac tgggcctcgt gaagtatgga gaaaaatatg ctggtaacta 4320

tagtggaggc aacaaacgca agctctctac agccatggct ttgatcggcg ggcctcctgt 4380 

ggtgtttctg gtgagtataa ctgtggatgg aaaactgttg ttctggcctg agtggaaaac 4440

atgactgttc aaaagtccta tatgtccagg gctgttgtat gattggcttg tcttccccca 4500

gggacagcag agcaaccttg gaaaagcaga gggaagcttc tcccttggca cacactgggg 4560 

tggctgtacc atgcctgcag atgctcccaa atagaggcac tccaagcact ttgtttctta 4620

gcgtgattga ggctggatat gtgatttgat ctttctctgg aacattcttt ctaatcatct 4680 

ttgtgttcat tccctgaaaa tgaagagtgt ggacacagct ttaaaatccc caaggtagca 4740

actaggtcat agttccttac acacggatag atgaaaaaca gatcagactg ggaagtggcc 4800

cttgaccttt tttcttctgt agataagagc attgatgtta ttacgggaag aagcctttga 4860 

ggcttttatg tattccacct cggtctggaa tttgtttctg taaggctaac agttgcaata 4920

tactagggta atctgagtga gctggaatta aaaaaaaaaa ggaatttcac cccaatctta 4980

tactgacttc aatagaggtt tcagacaaaa agttgttttg tat 5023 



<210> 29 

<211> 5138 

<212> DNA 

<213> Homo sapiens 

<220>

<221> misc_feature

<222> (1)...(5138)

<223> n = a, t, c, or g

<400> 29

ngccnngttn aaaangaaaa tttnnnnnaa attnaanntt annggngnnn tttccccaga 60

aaaaacnaaa angatttccn cccngggggg ncccccnant cnaaaaggcc ccncttnttt 120

gnggngaggg aaagnttttt ttggaatttt taatttttgg tcccccaaaa cctattattg 180

agaatttaat tacataaaaa agtactcaga atatttgagt ttcctgcatc aataagacat 240

ttataataat gaccttgttt acaaatgaat ttgaaagtta ctctaattct ttgattcatc 300

aagaaataac tagaatggca agttaaaatt taagctgttt caaagatgct tctgcattta 360

aaaacaaatt tatctttgat tttttttccc cccagcaaat aagacttatt ttattctaat 420

tacaggatga acccaccaca ggcatggatc ccaaagcccg gcggttcttg tggaattgtg 480

ccctaagtgt tgtcaaggag gggagatcag tagtgcttac atctcatagg tccgtagtaa 540

agtcttgggt tcctcactgt gggatgtttt aactttccaa gtagaatatg cgatcatttt 600

gtaaaaatta gaaaatacag aaaagcaaag agtaaaacaa ttattacctg aaattatata 660

tgcatattct tacaaaaatg caagcccagt ataaatactg ctctttttca cttaatatat 720

tgtaaacatt attccaagtc agtgcattta ggtgtcattt cttatagctg gatagtattc 780

cattaggata tactcttatt taactattcc cccttttgta gacatttgga ttatttccaa 840

cttgttcaca attgtaaaca ccactacact gaacagcatc atccctatat ccacatgtac 900

ttgtaacaga atacaattcc ctaggaagct ggaatgctgg aagtcatggt gatgttctca 960

tggttacaga gaatctctct aaaactaaaa cctctttctg ttttaccgca gtatggaaga 1020

atgtgaagct ctttgcacta ggatggcaat catggtcaat ggaaggttca ggtgccttgg 1080

cagtgtccag catctaaaaa ataggtaata aagataattt ctttgggata gtgcctagtg 1140

agaaggcttg atatttattc ttttgtgagt atataaatgg tgcctctaaa ataaagggaa 1200

ataaaactga gcaaaacagt atagtggaaa gaatgagggc tttgaagtcc gaactgcatt 1260

caaattctgt ctttaccatt tactggttct gtgactcttg ggcaagttac ttaactactg 1320 

taagagttag tttccctgga agatctacct cctagctttg tgctatagat gaaatgaaaa 1380

aaatttacat gtgccagtac tggtgagagc gcaagctttg gagtcaaaca caaatgggtt 1440

tgcatcctgg ccctaccaat tatgagctct gagccatggg caagtgacta actccctggg 1500

cctcagtttc tctgtaacat ctgtcagact tcatgggtcc aggtgaggat taaaggagat 1560 

catgtattta cagcacatgg catggtgctt cacataaaat aagtatttag taaatgataa 1620

ctggttcctt ctctcagaaa cttatttctg ggcctgccag gggccgccct ttttcatggc 1680 

acaagttggg ttcccagggt tcagtattct tttaaatagt tttctggaga tcctccattt 1740

gggtattttt tcctgctttc aggtttggag atggttatac aatagttgta cgaatagcag 1800

ggtccaaccc ggacctgaag cctgtccagg atttctttgg acttgcattt cctggaagtg 1860 

ttcyaaaaga gaaacaccgg aacatgctac aataccagct tccatcttca ttatcttctc 1920

tggccaggat attcagcatc ctctcccaga gcaaaaagcg actccacata gaagactact 1980 

ctgtttctca gacaacactt gaccaagtaa gctttgagtg tcaaaacaga tttacttctc 2040 

^^ggtgtgga ttcctgcccc gacactcccg cccataggtc caagagcagt ttgtatcttg 2100 

aattggtgct tgaattcctg atctactatt cctagctatg ctttttacta aacctctctg 2160 

aacctgaaaa gggagatgat gcctatgtac tctataggat tattgtgaga atttactgta 2220 

ataataacca taaaaactac catttagtga gcacctacca tgggccaggc attttacttg 2280 

gtgcctaatc ctatttaaat tagataaaaa agtaccaaat aggtcctgac acttaagaag 2340 

tactcagtaa atattttctt ccctcttccc tttaatcaag accgtatgtg ccaaagtaaa 2400 

tggatgactg agcagttggt gatgtagggg tggggggcga tatagaaagt cagtttttgg 2460 

ccgggcgtgg tggctcatgc ctgtaatccc agcactttgg gaggctgagg agcaggcaga 2520 

tcatgaggtc aggagatcca gataatcctg gccaacaggg tgaaaccccg tctctactaa 2580 

aaatacaaaa attagctggg catggtggtg cgcacttgta gtcccagcta cttgcgaggc 2640

tgaggcagga gaattgctcg aacccaggag gtggaggtta cagtgagcca aggtctcgcc 2700

actgcactcc agcctgggga cagagcaaga ccccatttca aggggggaaa aaaagtctat 2760

ttttaagttg ttattgcttt tttcaagtat tcttccctcc ttcacacaca gttttctagt 2820

taatccattt atgtaattct gtatgctcct acttgaccta atttcaacat ctggaaaaat 2880 

agaactagaa taaagaatga gcaagttgag tggtatttat aaaggtccat cttaatcttt 2940

taacaggtat ttgtgaactt tgccaaggac caaagtgatg atgaccactt aaaagacctc 3000

tcattacaca aaaaccagac agtagtggac gttgcagttc tcacatcttt tctacaggat 3060 

gagaaagtga aagaaagcta tgtatgaaga atcctgttca tacggggtgg ctgaaagtaa 3120

agaggaacta gactttcctt tgcaccatgt gaagtgttgt ggagaaaaga gccagaagtt 3180

gatgtgggaa gaagtaaact ggatactgta ctgatactat tcaatgcaat gcaattcaat 3240 

gcaatgaaaa caaaattcca ttacaggggc agtgcctttg tagcctatgt cttgtatggc 3300 

tctcaagtga aagacttgaa tttagttttt tacctatacc tatgtgaaac tctattatgg 3360

aacccaatgg acatatgggt ttgaactcac actttttttt ttttttttgt tcctgtgtat 3420

tctcattggg gttgcaacaa taattcatca agtaatcatg gccagcgatt attgatcaaa 3480 

atcaaaaggt aatgcacatc ctcattcact aagccatgcc atgcccagga gactggtttc 3540

ccggtgacac atccattgct ggcaatgagt gtgccagagt tattagtgcc aagtttttca 3600 

gaaagtttga agcaccatgg tgtgtcatgc tcacttttgt gaaagctgct ctgctcagag 3660 

tctatcaaca ttgaatatca gttgacagaa tggtgccatg cgtggctaac atcctgcttt 3720

gattccctct gataagctgt tctggtggca gtaacatgca acaaaaatgt gggtgtctcc 3780

aggcacggga aacttggttc cattgttata ttgtcctatg cttcgagcca tgggtctaca 3840

gggtcatcct tatgagactc ttaaatatac ttagatcctg gtaagaggca aagaatcaac 3900 

agccaaactg ctggggctgc aactgctgaa gccagggcat gggattaaag agattgtgcg 3960

ttcaaaccta gggaagcctg tgcccatttg tcctgactgt ctgctaacat ggtacactgc 4020

atctcaagat gtttatctga cacaagtgta ttatttctgg ctttttgaat taatctagaa 4080

aatgaaaaga tggagttgta ttttgacaaa aatgtttgta ctttttaatg ttatttggaa 4140

ttttaagttc tatcagtgac ttctgaatcc ttagaatggc ctctttgtag aaccctgtgg 4200 

tatagaggag tatggccact gcccactatt tttattttct tatgtaagtt tgcatatcag 4260 

tcatgactag tgcctagaaa gcaatgtgat ggtcaggatc tcatgacatt atatttgagt 4320 

ttctttcaga tcatttagga tactcttaat ctcacttcat caatcaaata ttttttgagt 4380

gtatgctgta gctgaaagag tatgtacgta cgtataagac tagagagata ttaagtctca 4440

gtacacttcc tgtgccatgt tattcagctc actggtttac aaatataggt tgtcttgtgg 4500 

ttgtaggagc ccactgtaac aatactgggc agcctttttt tttttttttt taattgcaac 4560 

aatgcaaaag ccaagaaagt ttaagggtca caagtctaaa caatgaattc ttcaacaggg 4620

aaaacagcta gcttgaaaac ttgctgaaaa acacaacttg tgtttatggc atttagtacc 4680 

ttcaaataat tggctttgca gatattggat accccattaa atctgacagt ctcaaatttt 4740 

tcatctcttc aatcactagt caagaaaaaa tataaaaaca acaaatactt ccatatggag 4800

catttttcag agttttctaa cccagtctta tttttctagt cagtaaacat ttgtaaaaat 4860 

actgtttcac taatacttac tgttaactgt cttgagagaa aagaaaaata tgagagaact 4920

attgtttggg gaagttcaag tgatctttca atatcattac taacttcttc cactttttcc 4980 

agaatttgaa tattaacgct aaaggtgtaa gacttcagat ttcaaattaa tctttctata 5040

ttttttaaat ttacagaata ttatataacc cactgctgaa aaagaaacaa atgattgttt 5100 

tagaagttaa aggtcaatat tgattttaaa atattaag 5138



<210> 30 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 30 

gtgttcctgc agagggcatg 20

<210> 31 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 31 

cacttccagt aacagctgac 20

<210> 32 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 32 

ctttgcgcat gtccttcatg c 21 

<210> 33 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 33

gacatcagcc ctcagcatct t 21



<210> 34 

<211> 19 

<212> DNA 

<213> Homo sapiens 



<400> 34 

caacaagcca tgttccctc 19 

<210> 35 

<211> 18 

<212> DNA 

<213> Homo sapiens 

<400> 35 

catgttccct cagccagc 18 

<210> 36 

<211> 19 

<212> DNA 

<213> Homo sapiens 

<400> 36 

cagagctcac agcagggac 19 

<210> 37 
<211> 21 
<212> PRT 

<213> Homo sapiens 
<400> 37 

Cys Ser Val Arg Leu Ser Tyr Pro Pro Tyr Glu Gln His Glu Cys His

1               5                   10                  15

Phe Pro Asn Lys Ala
                20

<210> 38 

<211> 14 

<212> DNA 

<213> Homo sapiens 

<400> 38 

gcctgtgtgt cccc 14 

<210> 39 

<211> 14 

<212> DNA 

<213> Homo sapiens 

<220>

<221> misc_feature

<222> (1)...(14)

<223> n = t or c

<400> 39 

gcctgtgngt cccc 14 

<210> 40 

<211> 45 

<212> DNA 

<213> Homo sapiens 

<400> 40 

aagaagatgc tgcctgtgtg tcccccaggg gcaggggggc tgcct 45 

<210> 41 
<211> 15 
<212> PRT 

<213> Homo sapiens 
<400> 41 

Lys Lys Met Leu Pro Val Cys Pro Pro Gly Ala Gly Gly Leu Pro 
1               5                   10                  15

<210> 42 
<211> 15 
<212> PRT 

<213> Mus musculus 
<400> 42 

Lys Lys Met Leu Pro Val Cys Pro Pro Gly Ala Gly Gly Leu Pro 
1               5                   10                  15

<210> 43 
<211> 15 
<212> PRT 

<213> Homo sapiens 
<400> 43 

Lys Lys Met Leu Pro Val Arg Pro Pro Gly Ala Gly Gly Leu Pro 
1               5                   10                  15

<210> 44 
<211> 5 
<212> PRT 

<213> Caenorhabditis elegans 
<400> 44 

Leu Leu Gly Gly Ser 
1 5 

<210> 45

<211> 45 

<212> DNA 

<213> Homo sapiens 

<400> 45 

aagaagatgc tgcctgtgcg tcccccaggg gcaggggggc tgcct 45 

<210> 46 

<211> 14 

<212> DNA 

<213> Homo sapiens 

<400> 46 

gcctacttgc agga 14 

<210> 47 

<211> 14 

<212> DNA 

<213> Homo sapiens 

<400> 47 

gcctacttgc ggga 14 

<210> 48 

<211> 45 

<212> DNA 

<213> Homo sapiens 

<400> 48 

tgggggggct tcgcctactt gcaggatgtg gtggagcagg caatc 45

<210> 49 
<211> 15 
<212> PRT 

<213> Homo sapiens 
<400> 49 

Trp Gly Gly Phe Ala Tyr Leu Gln Asp Val Val Glu Gln Ala Ile
1               5                   10                  15

<210> 50 
<211> 15 
<212> PRT 

<213> Mus musculus 
<400> 50 

Trp Gly Gly Phe Ala Tyr Leu Gln Asp Val Val Glu Gln Ala Ile
1               5                   10                  15

<210> 51 
<211> 15

<212> PRT 

<213> Homo sapiens 



<400> 51 

Trp Gly Gly Phe Ala Tyr Leu Arg Asp Val Val Glu Gln Ala Ile
1               5                   10                  15



<210> 52 
<211> 12 
<212> PRT 

<213> Caenorhabditis elegans 



<400> 52 

Phe Met Thr Val Gln Arg Ala Val Asp Val Ala Ile
1               5                   10



<210> 53 
<211> 45 
<212> DNA 
<213> Homo sapiens



<400> 53 

tgggggggct tcgcctactt gcgggatgtg gtggagcagg caatc 45

<210> 54 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<222> (1)...(25)

<223> n is a, t, c, or g.

<400> 54

tcattcctct tgtnngcncn gnncn 25

<210> 55 

<211> 45 

<212> DNA 

<213> Homo sapiens 



<400> 55 

agtagcctca ttcctcttct tgtgagcgct ggcctgctag tggtc 45 

<210> 56 
<211> 15 
<212> PRT 

<213> Homo sapiens 
<400> 56

Ser Ser Leu Ile Pro Leu Leu Val Ser Ala Gly Leu Leu Val Val
1               5                   10                  15



<210> 57 
<211> 15 
<212> PRT 
<213> Mus musculus



<400> 57 

Ser Ser Leu Ile Pro Leu Leu Val Ser Ala Gly Leu Leu Val Val
1               5                   10                  15



<210> 58 
<211> 14 
<212> PRT 

<213> Homo sapiens 



<400> 58 

Ser Ser Leu Ile Pro Leu Val Ser Ala Gly Leu Leu Val Val
1               5                   10



<210> 59 

<211> 15 

<212> PRT 

<213> Caenorhabditis elegans 



<400> 59 

Ile Asn Tyr Ala Lys Leu Thr Phe Ala Val Ile Val Leu Thr Ile
1               5                   10                  15



<210> 60 

<211> 42 

<212> DNA 

<213> Homo sapiens 



<400> 60 

agtagcctca ttcctcttgt gagcgctggc ctgctagtgg tc 42 

<210> 61 

<211> 25 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<222> (1)...(25)

<223> n is a, t, or g. 

<400> 61 

tgatgaagat gananncngn ngcga 25 

<210> 62

<211> 36 

<212> DNA 

<213> Homo sapiens 

<400> 62 

aatgatgaag atgaagatgt gaggcgggaa agacag 36

<210> 63 
<211> 12 
<212> PRT 

<213> Homo sapiens 
<400> 63 

Asn Asp Glu Asp Glu Asp Val Arg Arg Glu Arg Gln
1               5                   10

<210> 64 
<211> 12 
<212> PRT 

<213> Mus musculus 
<400> 64 

Asn Asp Glu Asp Glu Asp Val Arg Arg Glu Arg Gln
1               5                   10

<210> 65 
<211> 10 
<212> PRT 

<213> Homo sapiens 
<400> 65 

Asn Asp Glu Asp Val Arg Arg Glu Arg Gln
1               5                   10

<210> 66 
<211> 15 
<212> PRT 

<213> Caenorhabditis elegans 
<400> 66 

Asp Glu Arg Asp Val Glu Asp Ser Asp Val Ile Ala Glu Lys Ser
1               5                   10                  15

<210> 67 

<211> 30 

<212> DNA 

<213> Homo sapiens 

<400> 67 

aatgatgaag atgtgaggcg ggaaagacag 30

<210> 68

<211> 14 

<212> DNA 

<213> Homo sapiens 

<400> 68 

agttgtacga atag 14 

<210> 69 

<211> 14 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<222> (1)...(14)
<223> n is t or c. 

<400> 69 

agttgtanga atag 14 

<210> 70 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 70 

ggctggatta gcagtcctca 20

<210> 71 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 71 

ggatttccca gatcccagtg 20

<210> 72 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 72 

gacagacttg gcatgaagca 20

<210> 73 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 73 

gcacttggca gtcacttctg 20



<210> 74 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 74 

cgtttctcca ctgtcccatt 20 

<210> 75 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 75 

acttcaagga cccagcttcc 20

<210> 76 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 76 

tcggtttctt gtttgttaaa ctca 24 

<210> 77 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 77 

tcccaaggct ttgagatgac 20

<210> 78 

<211> 19 

<212> DNA 

<213> Homo sapiens 

<400> 78 

ggctccaaag cccttgtaa 19 

<210> 79 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 79 

gctgctgtga tggggtatct 20

<210> 80 

<211> 25

<212> DNA 

<213> Homo sapiens 

<400> 80 

tttgtaaatt ttgtagtgct cctca 25

<210> 81 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 81 

tagtcagccc ttgcctccta 20 

<210> 82 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 82 

aaaggggctt ggtaagggta 20

<210> 83 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 83 

gatgtggtgc tccctctagc 20

<210> 84 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 84 

caagtgagtg cttgggattg 20

<210> 85 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 85 

gcaaattcaa atttctccag g 21 

<210> 86 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 86

tcaaggagga aatggacctg 20



<210> 87 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 87 

ctgaaagttc aagcgcagtg 20

<210> 88 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 88 

tgcagactga atggagcatc 20

<210> 89 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 89 

gccaggggac actgtattct 20

<210> 90 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 90 

aggtcctctg ccttcactca 20

<210> 91 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 91 

ccagtgctta cccctgctaa 20

<210> 92 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 92 

cacacaacag agcttcttgg a 21 

<210> 93

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 93 

acctggaaca ggtgtggtgt 

<210> 94 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 94 

gggctaacat gccactcagt a 

<210> 95 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 95 

gtttgttgca gatggggaag 

<210> 96 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 96 

caccagaaga aggagcatgg 

<210> 97 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 97 

ctggactcgt agggatttgc 

<210> 98 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 98 

gcctgtcaca gagaaatgct t 



<210> 99 
<211> 21 
<212> DNA 



<213> Homo sapiens 



<400> 99 

ttacggaatg atcctgtgct c 

<210> 100 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 100 

agtcaggttt ccggtcacac 

<210> 101 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 101 

ccgttcctta tatcctcagg tg 

<210> 102 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 102 

ccttgtacac actcgcactg a 

<210> 103 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 103 

tgttgtccac aggttccaga 

<210> 104 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 104 

tgaggtttat gggcatggtt 

<210> 105 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 105 



atgtttttcc ttggctgtgc

<210> 106
<211> 20
<212> DNA
<213> Homo sapiens

<400> 106
atctgccctt tcttgtctga

<210> 107
<211> 20
<212> DNA
<213> Homo sapiens

<400> 107
agggagctgc acagtggata

<210> 108
<211> 24
<212> DNA
<213> Homo sapiens

<400> 108

tcactcccat atttcagaac ttga



<210> 109 

<211> 22 

<212> DNA 

<213> Homo sapiens 



<400> 109 

tgtttattgg aagatcggtg aa 

<210> 110 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 110 

cgttagagac tgaatctttg tcctg 

<210> 111 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 111 

agtcctgcct tccacagttg 



<210> 112 



<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 112 

ggtagttacg tgttaggggc a 

<210> 113 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 113 

caggaacatt aggccagatt g 

<210> 114 

<211> 23 

<212> DNA

<213> Homo sapiens 

<400> 114 

catgtatgtg taggacagca tga 

<210> 115 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 115 

ctgtttcaaa gatgcttctg c 

<210> 116 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 116 

cctaggaagc tggaatgctg 

<210> 117 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 117 

gggttcccag ggttcagtat 

<210> 118 

<211> 23 

<212> DNA 

<213> Homo sapiens 



<400> 118

cttgacctaa tttcaacatc tgg 23



<210> 119 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 119 

atccccaact caaaaccaca 20 

<210> 120 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 120 

aagtccaatt tagcccacgt t 21 

<210> 121 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 121 

ccagccattc aaaattctcc 20 

<210> 122 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 122 

ggtgcaggtc aatttccaat 20

<210> 123 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 123 

ccccttcacc accattacaa 20 

<210> 124 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 124 

tgtccaagga aaagcctcac 20






<210> 125 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 125 

aggacctctt gccagactca 

<210> 126 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 126 

aggagatgac acaggccaag 

<210> 127 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 127 

cgcacacctc tgaagctacc 

<210> 128 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 128 

acctcactca cacctgggaa 

<210> 129 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 129 

gcctcctgcc tgaaccttat 

<210> 130 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 130 

caaaatcatg acaccaagtt gag 



<210> 131 
<211> 20 
<212> DNA 



<213> Homo sapiens 



<400> 131 
catgcacatg cacacacata

<210> 132
<211> 20
<212> DNA
<213> Homo sapiens

<400> 132
ccttagcccg tgttgagcta

<210> 133
<211> 21
<212> DNA
<213> Homo sapiens

<400> 133
tgcttttatt cagggactcc

<210> 134
<211> 20
<212> DNA
<213> Homo sapiens

<400> 134
cccatgcact gcagagattc

<210> 135
<211> 19
<212> DNA
<213> Homo sapiens

<400> 135
aaggcaggag acatcgctt

<210> 136
<211> 20
<212> DNA
<213> Homo sapiens

<400> 136
gggatcagca tggtttccta

<210> 137
<211> 20
<212> DNA
<213> Homo sapiens

<400> 137
gcttaagtcc cactcctccc

<210> 138
<211> 20
<212> DNA
<213> Homo sapiens

<400> 138
attttcctcc gcatgtgtgt

<210> 139
<211> 20
<212> DNA
<213> Homo sapiens

<400> 139
tcacagaagc ctagccatga

<210> 140
<211> 20
<212> DNA
<213> Homo sapiens

<400> 140
aacagagcag ggagatggtg

<210> 141
<211> 20
<212> DNA
<213> Homo sapiens

<400> 141
tctgcacctc tcctcctctg

<210> 142
<211> 20
<212> DNA
<213> Homo sapiens

<400> 142
actggggcca acattaatca

<210> 143
<211> 20
<212> DNA
<213> Homo sapiens

<400> 143
cttccccatc tgcaacaaac



<210> 144
<211> 20
<212> DNA
<213> Homo sapiens

<400> 144
gctaaaggcc atccaaagaa

<210> 145
<211> 20
<212> DNA
<213> Homo sapiens

<400> 145
tcaagtgcat ctgggcataa

<210> 146
<211> 20
<212> DNA
<213> Homo sapiens

<400> 146
tctgaagtcc attcccttgg

<210> 147
<211> 20
<212> DNA
<213> Homo sapiens

<400> 147
caatgtggca tgcagttgat

<210> 148
<211> 19
<212> DNA
<213> Homo sapiens

<400> 148
gaagctacca gcccatcct

<210> 149
<211> 20
<212> DNA
<213> Homo sapiens

<400> 149
catttccccc actgtttcag

<210> 150
<211> 20
<212> DNA
<213> Homo sapiens

<400> 150
ccaaggcttt cttcaatcca

<210> 151
<211> 20
<212> DNA
<213> Homo sapiens

<400> 151
gatccgttta acctgccaac

<210> 152
<211> 19
<212> DNA
<213> Homo sapiens

<400> 152
atgcccctgc caactttac

<210> 153
<211> 20
<212> DNA
<213> Homo sapiens

<400> 153
ctctgcagct gttcccctac

<210> 154
<211> 20
<212> DNA
<213> Homo sapiens

<400> 154
tatcaatcca tggccctgac

<210> 155
<211> 20
<212> DNA
<213> Homo sapiens

<400> 155
agagtccctg ccctccttct

<210> 156
<211> 20
<212> DNA
<213> Homo sapiens

<400> 156
aaggcagtca gcagtgtcaa



<210> 157 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 157 

ggggaacatc ctgtgcttag 

<210> 158 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 158 

ccattggtga gtgtttccct 

<210> 159 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 159 

agtcagcaaa ctgctgggtt 

<210> 160 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 160 

attgctccat cctggcataa 

<210> 161 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 161 

tcatggatga ttttatgtgc ttc 

<210> 162 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 162 

gcgtgtggaa aagccataag



<210> 163 
<211> 20 
<212> DNA 



<213> Homo sapiens 



<400> 163 

gccaatcata caacagccct 

<210> 164 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 164 

tgatcgcata ttctacttgg aaa 

<210> 165 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 165 

tccctttatt ttagaggcac ca 

<210> 166 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 166 

gatcaggaat tcaagcacca a 

<210> 167 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 167 

tgggttccat aatagagttt caca 

<210> 168 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 168 

tgtcagctgt tactggaagt gg 

<210> 169 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 169 



tgtcagctgc tgctggaagt gg 22



<210> 170 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 170 

aggagctggc cgaagccaca a 21

<210> 171 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 171 

aggagctggc tgaagccaca a 21 

<210> 172 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 172 

aatgatgcca ccaaacaaat g 21 

<210> 173

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 173 

aatgatgcca tcaaacaaat g 21 

<210> 174 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 174 

gaggtggctc cgatgaccac a 21 

<210> 175 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 175 

gaggtggctc tgatgaccac a 21 

<210> 176 






<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 176 

ttccttaaca gaaatagtat c 

<210> 177 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 177 

ttccttaaca aaaatagtat c 

<210> 178 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 178 

ggaagtgttc caaaagagaa a 

<210> 179 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 179 

ggaagtgttc taaaagagaa a 

<210> 180 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 180 

agtaaagagg gactagactt t 

<210> 181 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 181 

agtaaagagg aactagactt t 

<210> 182 

<211> 21 

<212> DNA 

<213> Homo sapiens 



<400> 182 

gcctacttgc aggatgtggt g 21 

<210> 183 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 183 

gcctacttgc gggatgtggt g 21 

<210> 184 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 184 

cctcattcct cttcttgtga gcg 23 

<210> 185 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 185 

cctcattcct cttgtgagcg 20 

<210> 186 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 186 

gcaggactac gtgggcttca c 21 

<210> 187 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 187 

gcaggactac atgggcttca c 21 

<210> 188 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 188 

aaaagtctac cgagatggga t 21 

<210> 189 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 189 

aaaagtctac tgagatggga t 

<210> 190 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 190 

ggccagatca cctccttcct g 

<210> 191 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 191 

ggccagatca tctccttcct g 

<210> 192 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 192 

acacaccaca tggatgaagc g 
<210> 193 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 193 

acacaccaca cggatgaagc g 

<210> 194 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 194 

cctggaagaa gtaagttaag t 



<210> 195 
<211> 21 
<212> DNA 



<213> Homo sapiens 
<400> 195 

cctggaagaa ctaagttaag t 

<210> 196 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 196 

gctgcctgtg tgtcccccag g 

<210> 197 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 197 

gctgcctgtg cgtcccccag g 

<210> 198 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 198 

tagccattat ggaattactg ct 

<210> 199 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 199 

tagccattat caattactgc t 

<210> 200 

<211> 26 

<212> DNA 

<213> Homo sapiens 

<400> 200 

gatgaagatg aagatgtgag gcggga 

<210> 201 

<211> 20 

<212> DNA 

<213> Homo sapiens 



<400> 201 



gatgaagatg tgaggcggga 



<210> 202 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 202 

aatagttgta cgaatagcag g 

<210> 203 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 203 

aatagttgta tgaatagcag g 

<210> 204 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 204 

acacgctggg ggtgctggct g 

<210> 205 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 205 

acacgctggg cgtgctggct g 

<210> 206 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 206 

gaccagccac ggcgtccctg 

<210> 207 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 207 

gaccagccac gggcgtccct g 
<210> 208 



<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 208 

cattttctta gaaaagagag gt 22 

<210> 209 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 209 

cattttctta gagaagagag gt 22 

<210> 210 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 210 

gaaaattagt atgtaaggaa g 21 

<210> 211 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 211 

gaaaattagt ctgtaaggaa g 21 

<210> 212 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 212 

cctccgcctg ccaggttcag cgatt 25 

<210> 213 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 213 

cctccgcctg ccgggttcag cgatt 25 

<210> 214 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 214 

tatgtgctga ccatgggagc ttgtt 25 

<210> 215 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 215 

tatgtgctga ccgtgggagc ttgtt 25 

<210> 216 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 216 

gtgacaccca acggagtagg g 21 

<210> 217 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 217 

gtgacaccca gcggagtagg g 21 

<210> 218 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 218 

agtatccctt gttcacgaga a 21 

<210> 219 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 219 

agtatccctc ccttgttcac gagaa 25 

<210> 220 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 220 

ctgggttcct gtatcacaac c 21 

<210> 221 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 221 

ctgggttcct atatcacaac c 21 

<210> 222 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 222 

ggcctaccaa gggagaaact g 21 

<210> 223 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 223 

ggcctaccaa aggagaaact g 21 

<210> 224 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 224 

tttaaagggg gtgattagga 20 

<210> 225 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 225 

tttaaagggg ttgattagga 20 

<210> 226 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 226 

gaagaaattt gtttttttga tt 22 

<210> 227 
<211> 22 
<212> DNA 
<213> Homo sapiens 



<400> 227 

gaagaaattt ttttttttga tt 22 

<210> 228 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 228 

gcgggcatcc cgagggaggg g 21 

<210> 229 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 229 

gcgggcatcc tgagggaggg g 21 

<210> 230 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 230 

agggaggggg gctgaagatc a 21 

<210> 231 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 231 

agggaggggg actgaagatc a 21 

<210> 232 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 232 

aggagccaaa cgctcattgt 20 

<210> 233 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 233 

aggagccaaa gcgctcattg t 21 

<210> 234 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 234 

aagccactgt ttttaaccag t 21 

<210> 235 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 235 

aagccactgt atttaaccag t 21 

<210> 236 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 236 

cgtgggcttc acactcaaga t 21 

<210> 237 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 237 

cgtgggcttc ccactcaaga t 21 

<210> 238 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 238 

tcacactcaa gatcttcgct g 21 

<210> 239 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 239 

tcacactcaa catcttcgct g 21 

<210> 240 

<211> 21 

<212> DNA 

<213> Homo sapiens 



<400> 240 

gcagcctcac ccgctcttcc c 

<210> 241 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 241 

gcagcctcac tcgctcttcc c 

<210> 242 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 242 

agaagagaat atcagaaatc t 

<210> 243 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 243 

agaagagaat gtcagaaatc t 

<210> 244 
<211> 21 
<212> DNA 

<213> Homo sapiens 

<400> 244 

gcgcagtgcc ctgtgtcctt a 

<210> 245 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 245 

gcgcagtgcg ctgtgtcctt a 

<210> 246 

<211> 21 

<212> DNA 

<213> Homo sapiens 



<400> 246 

gatctaaggt tgtcattctg g 

<210> 247 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 247 

gatctaaggt ggtcattctg g 

<210> 248 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 248 

ctcttctgtt agcacagaag aga 

<210> 249 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 249 

ctcttctgtt atcacagaag aga 

<210> 250 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 250 

cattctaggg atcatagcca t 

<210> 251 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 251 

cattctaggg gtcatagcca t 

<210> 252 

<211> 22 

<212> DNA 

<213> Homo sapiens 



<400> 252 

aagtacagtg ggaggaacag cg 



<210> 253 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 253 

aagtacagtg tgaggaacag cg 

<210> 254 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 254 

attcctaaaa aatagaaatg ca 

<210> 255 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 255 

attcctaaaa agtagaaatg ca 

<210> 256 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 256 

ggcccctgcc ttattattac t 

<210> 257 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 257 

ggcccctgcc gtattattac t 

<210> 258 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 258 

tgagagaatt acttgaaccc gg 



<210> 259 
<211> 22 
<212> DNA 



<213> Homo sapiens 



<400> 259 

tgagagaatt gcttgaaccc gg 

<210> 260 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 260 

tttgctgaaa caatcactga c 

<210> 261 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 261 

tttgctgaaa taatcactga c 

<210> 262 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 262 

aacctcagtt ccctcatctg tg 

<210> 263 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 263 

aacctcagtt tcctcatctg tg 

<210> 264 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 264 

ctggacacca gaaataatgt c 

<210> 265 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 265 

ctggacacca aaaataatgt c 21 

<210> 266 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 266 

tcctatgtgt cctccaccaa t 21 

<210> 267 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 267 

tcctatgtgt gctccaccaa t 21 

<210> 268 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 268 

aagaagtggc ttgtattttg c 21 

<210> 269 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 269 

aagaagtggc ctgtattttg c 21 

<210> 270 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 270 

aactgatttg attggtatag ctg 23 

<210> 271 

<211> 23 

<212> DNA 

<213> Homo sapiens 

<400> 271 

aactgatttg gttggtatag ctg 23 

<210> 272 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 272 

cagggtccaa cccggacctg a 

<210> 273 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 273 

cagggtccaa tccggacctg a 

<210> 274 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 274 

ttgggaggct aaggcaggag aa 

<210> 275 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 275 

ttgggaggct gaggcaggag aa 

<210> 276 

<211> 15 

<212> DNA 

<213> Gallus gallus 

<400> 276 

accaggggaa tctcc 

<210> 277 

<211> 15 

<212> DNA 

<213> Gallus gallus 

<400> 277 

accagggaaa tctcc 

<210> 278 

<211> 45 

<212> DNA 

<213> Gallus gallus 

<400> 278 

cgctacccaa caccagggga atctcctggt attgttggaa acttc 45 

<210> 279 
<211> 15 
<212> PRT 

<213> Homo sapiens 
<400> 279 

Arg Tyr Pro Thr Pro Gly Glu Ala Pro Gly Val Val Gly Asn Phe 
1               5                   10                  15 

<210> 280 
<211> 15 
<212> PRT 

<213> Mus musculus 
<400> 280 

Arg Tyr Pro Thr Pro Gly Glu Ala Pro Gly Val Val Gly Asn Phe 
1               5                   10                  15 

<210> 281 
<211> 15 
<212> PRT 

<213> Gallus gallus 
<400> 281 

Arg Tyr Pro Thr Pro Gly Glu Ser Pro Gly Ile Val Gly Asn Phe 
1               5                   10                  15 

<210> 282 
<211> 15 
<212> PRT 

<213> Gallus gallus 
<400> 282 

Arg Tyr Pro Thr Pro Gly Lys Ser Pro Gly Ile Val Gly Asn Phe 
1               5                   10                  15 

<210> 283 
<211> 45 
<212> DNA 

<213> Gallus gallus 
<400> 283 

cgctacccaa caccagggaa atctcctggt attgttggaa acttc 45 

<210> 284 

<211> 19 

<212> DNA 

<213> Homo sapiens 

<400> 284 

gcgtcaggga tggggacag 

<210> 285 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 285 

gcgtcaggga ttggggacag 

<210> 286 

<211> 17 

<212> DNA 

<213> Homo sapiens 

<400> 286 

ccacttcggt ctccatg 

<210> 287 

<211> 17 

<212> DNA 

<213> Homo sapiens 

<400> 287 

ccacttcgat ctccatg 



